Kenneth Craik: The man and his work*
O. L. Zangwill 


Those who have had the honour to be invited to deliver the Craik Lecture (and a very
great honour it is) usually begin with a eulogy of Kenneth Craik and then, very properly,
proceed to talk about their own work for the rest of the afternoon. I am afraid I propose to 
depart from this convention. My entire lecture will be devoted to Kenneth Craik, and if on 
occasion I make passing reference to work of my own, it will simply be because he was in 
some way connected with it or it had a marginal relevance to his interests. For this breach 
with tradition I make no excuse. After all, what could be a more fitting subject for a Craik 
Lecture than Kenneth Craik himself? 

Although there are plenty of people still active today who remember Kenneth well, for 
many of my younger colleagues in experimental psychology he is something of a mythical 
figure. For of those who die young, it is not only poets whose lives become encrusted with 
myth and legend; it happens to scientists too, even if the elaborations of posterity are for 
the most part less colourful. In this respect, Kenneth Craik, who died in 1945 at the age of 
31, is no exception. Far from wishing to de-mythologize Kenneth Craik, I wish only to give 
some account of his work and personality as seen through the eyes of a contemporary. 
Everyone in Cambridge today knows Craik as the name of an annual lecture and of a 
University building, or more properly of half of one! For me, it is the name of a colleague
and friend. 

May I begin with a brief outline of his life? Kenneth Craik was born on 29 March 1914 
and educated at the Edinburgh Academy and Edinburgh University. At school he was a 
classic, I believe a very good one, and at university he read philosophy, where he came 
much under the influence of the distinguished Kantian scholar Norman Kemp Smith. For 
his subsidiary subject Kenneth chose psychology, and I have been told that he began 
making apparatus in the Psychological Laboratory even before he graduated. I may add 
that he also made other things, some of them more than a little frivolous, such as ships in 
bottles or the set of miniature steam engines, each one smaller than the last, which is now 
in the Royal Scottish Museum.

Having duly obtained his First, Craik was awarded the Shaw Fellowship open to 
graduates in philosophy of all four Scottish universities. Before becoming a research 
student at Cambridge he worked for a year in Edinburgh under Professor James Drever Sr, 
and my predecessor in office, Sir Frederic Bartlett, has told the dramatic story of how he 
first came to hear of Kenneth's existence. Walking along a country road with Bartlett,
Professor Drever suddenly remarked to him: ‘Next term I am going to send you a genius!’ 
He did, and although much later on Edinburgh University made some attempt to reclaim 
its genius, in the event Kenneth stayed on in Cambridge for the remainder of his life.

Bartlett vividly describes his first meeting with Kenneth. ‘He came into my room,’ he
writes, ‘and my immediate impression was of a tall, rather powerful, spare frame; a face
pale but full of life; a firm chin, straight mouth, singularly attractive dark eyes, and above a
shock of black hair. From the beginning he was wholly “at home”, as we say, with any
amount of genuine modesty, but not a scrap of false humility. He knew, and in a very few
minutes I knew, of the power that was within him.’

For a time, Kenneth was in considerable doubt as to what his line of research should be.
It has been said that his real ambition was to write a thesis concerned with the philosophy
of psychology but of this I know nothing at first hand. I do, however, know that he
considered quite seriously working on trial-and-error learning. Indeed the importance he
attached to Thorndike’s work emerges clearly in several of his later writings. In the event,
though, he settled for vision research. This decision undoubtedly owed much to his high
regard for Dr A. F. Rawdon Smith, at that time the holder of a Beit Fellowship in the
Psychological Laboratory and working mainly in the psycho-acoustic field. It was Rawdon
(as he was always known) who suggested to Kenneth that he might work on a visual
problem regarding which he had himself carried out a pilot experiment. This involved the
measurement of the differential sensitivity of the eye with varying degrees of adaptation to
the brightness of the illumination. Kenneth fell in with this suggestion and it led to several
further experiments on visual adaptation which eventually formed the core of both his PhD
thesis, submitted in 1940, and a slightly fuller dissertation on the same topic which won
him his Fellowship at St John’s College the following year.

* The Kenneth Craik Lecture delivered in Cambridge on 10 March 1978 under the auspices of St John's College

The advent of war markedly changed the pattern of Kenneth’s life and work. Ineligible
for active service on account of a minor physical disability, he served throughout the war
essentially as an incredibly busy applied research scientist. Professor Bartlett’s policy
at the time was to turn the Department over virtually completely to war work and to make 
his staff available to take on a variety of problems of direct relevance to the war effort. 
Operational problems relating to human factors were brought in from all three Services, in 
particular the RAF, by the Medical Research Council and its committees, as well as by 
certain government agencies. Without wishing in any way to play down the key role of 
Bartlett himself, or indeed to minimize the contribution of many others who worked in his 
laboratory at that time, I think it is true to say that the whole success of Bartlett’s war-time 
policy hinged on one man, and that man was Kenneth Craik. I feel sure that Bartlett 
himself recognized this when, in 1944, the Medical Research Council agreed to establish an 
Applied Psychology Unit in his Department. With characteristic generosity, Bartlett 
stipulated that its Director should be not himself but Kenneth Craik.

Right through this incredibly busy and responsible period in Kenneth's life, he spent 
virtually all day working in the laboratory, his experiments being interrupted only by 
committee meetings in London and frequent, often uncomfortable, trips by train to 
research establishments or operational bases elsewhere, many of them remote from 
Cambridge. Yet, writing almost entirely at night, he managed to produce one book and a 
substantial part of a second, together with a very large number of reports and papers, 
many of them for obvious reasons classified at the time. Yet no matter how taken up he
was with immediate practical problems, basic scientific issues remained above all his
abiding preoccupation.

Kenneth Craik died as a result of a road accident on 7 May 1945, only 2 days before the 
end of the war in Europe. Never an accomplished cyclist, he came into collision with the 
partly open door of a stationary car in King's Parade and was thrown from his machine 
and apparently struck by an approaching lorry. He never regained consciousness and died 
later the same night. At the inquest, a verdict of accidental death was returned and no
blame was attached to either driver. It is not without interest that no fewer than 10 
recorded presentiments of Craik's lethal accident are said to have been discovered among 
his papers. He was unmarried.


Visual adaptation 

Let us now go from the man to his work. As I said, Kenneth's earliest experiments were 
concerned with visual adaptation. I have a certain weakness for first experiments, much as 
some literary critics have for first novels, and would like to talk briefly about this one. I 
think it illustrates well Kenneth's ingenuity in design and construction of equipment, his 
competence as an experimentalist and his remarkable theoretical acumen. As Bartlett wrote 
of Kenneth: ‘I do not think he ever did an experiment, however simple and small it might 
appear, which was not informed by some idea which took its issues at once into a wide
field of principle.’ 

This experiment was concerned with the effects of adaptation on differential brightness 
sensitivity. Craik designed a simple though effective apparatus that provided adapting and 
test fields that could be independently varied in intensity from a value close to the 
threshold for the dark-adapted eye to a maximum of 4000 foot-lamberts. Light from a 100 W
bulb fell directly on a ground-glass screen to provide an adapting field of intensity I. A
concave mirror placed behind the lamp focused the filament of the lamp on the ground-glass
screen to provide the independent test field of intensity ΔI. The image of the ground-glass
screen was focused on the pupil of the eye by a lens, so that the lens appeared uniformly
filled with light (the Maxwellian view). Differential sensitivity for brightness (ΔI/I) was
measured by varying the amplitude of slow constant flicker of the test field. The change 
from the steady adapting field to the flickering test field, as well as alterations in 
illumination and in size of test field, could be made rapidly by means of stops, filter frames 
and shutters linked to a crank on a rotating shaft. 




Experiments on Visual Adaptation. Kenneth Craik (left) in 1938. The subject is O. L. Zangwill. 





The results of this experiment clearly showed that differential brightness sensitivity is a 
function neither of the adapting field alone nor of the test field alone but of both together; 
the curves relating ΔI/I over a wide range of light intensities above or below the adapting
illumination are very similar. Craik suggested that the simplest explanation is to regard the 
adapting illumination as 'setting the eye' to a certain range of sensitivity and, except at 
very dim levels of illumination, differential sensitivity is keenest at an illumination equal to 
that of the adapting illumination. Analogous results were obtained in later experiments on 
the effects of adaptation on visual acuity and subjective brightness.

Kenneth had much of interest to say about this ‘range-setting’ property of visual
adaptation, such that when the eye is adapted to any one illumination, it is sensitive to 
rapid variations in illumination over a certain range. If it is adapted to a new illumination, 
the whole range of sensitivity is shifted bodily in this direction. In this connection, Kenneth
cites Lythgoe's analogy of a multi-range ammeter giving full-scale deflection with different 
currents, according to the shunt employed. This, I believe, was the first instance of Craik's 
fertility in seeking analogues of human perception in the properties of man-made machines. 

It has been pointed out to me by Dr John Mollon that Craik may well have 
underestimated the degree of tuning of the adapted state. More recent work, he tells me, 
suggests that if he had tested with a very brief spatially differentiated target shortly after 
offset, he would have found a much narrower dynamic range around the current 
adaptation level. But Kenneth would have certainly modified his ideas about the 
range-setting properties of the eye in the light of new experimental findings, and the value 
of his Fellowship dissertation is certainly not limited to this aspect of his work. He devotes 
much thought to the spatial effects of adaptation, the relation of photochemical to neural 
processes and to the effects of mechanical and electrical excitation of the eye. Although 
much of his dissertation is inevitably dated, it still has a freshness and intellectual 
incisiveness hard to fault. True, his enthusiasm at times outruns his judgement, most 
notably in his overestimation of Selig Hecht's photochemical theories at that time much in 
vogue. Even so, he expresses on occasion distinct doubt as to how well Hecht's theoretical 
structure really held together in a satisfactory quantitative fashion. In this connection, it is 
apposite to note that in the copy of his Fellowship dissertation that he presented to our 
Departmental Library, there is a pencilled note in the margin of his discussion of Hecht's 
theory: ‘I now disagree with this and hold other views.’

Craik's work on adaptation led him to undertake a number of further experiments on 
vision, mostly on a small scale and sometimes alarmingly personal. He did several 
experiments on pressure blindness with himself as subject, in the course of which he 
established beyond reasonable doubt the retinal origin of visual after-images. He further 
established that if the dark-adapted eye is light adapted while thus blinded, the subsequent 
dark-adaptation curve is exactly the same as if the eye had been light adapted while vision
was normal. He found, too, that dark adaptation is not delayed by pressure applied
immediately after light adaptation, indicating that photochemical regeneration is
unaffected. These results led him to conclude that light adaptation of both rods and cones
is essentially a retinal process.


The tendency to account for sensory phenomena wherever possible in terms of peripheral 
mechanisms was characteristic of Cambridge physiology of the period. Yet it did not deter 
Kenneth from concerning himself with higher-order aspects of visual perception. For
example, he devoted much attention to the visual constancies, particularly brightness
constancy, which he thought might depend in some degree upon processes akin to those
operating in visual adaptation. He was ever a strong believer in the concept of levels of 
function in the central nervous system, yet ever anxious to seek continuity between higher 
and lower levels of nervous activity. As he put it, many relatively simple processes that 
characterize retinal activity ‘seem to pervade the further development of sensation and 
perception'. It was in elucidating these difficult issues of perception that he seemed to find 
his major challenge.

As might be expected, Kenneth's interests in visual physiology brought him into sharp 
conflict with the Gestalt psychologists, at that time the leading group working on visual 
perception. The Gestalt psychologists were not only little interested in physiological issues 
but were convinced that peripheral factors had little, if any, importance for our 
understanding of visual perception. In particular, they urged that the visual constancies are 
to be understood in terms of ‘forces of organization’ as foreign to modern physiology as
was animal magnetism to eighteenth-century physics. In the case of brightness perception,
for example, they drew a sharp distinction between contrast effects, which they regarded as 
of purely retinal origin, and brightness constancy, which they attributed to central 
organization. Craik argued first that brightness constancy might owe its origin in part at 
least to visual adaptation and, even if it also involved higher processes, these might well be 
governed by similar general principles. As he saw it, visual adaptation is a process that 
makes possible a sensation corresponding more closely to the gradients of illumination 
between objects than to the absolute level of the illumination. It is the first stage in a 
corrective mechanism that facilitates recognition of objects under changing conditions of 
illumination.

I was myself at that time somewhat seduced by Gestalt theory, largely as a result of 
having spent a semester in Professor Wolfgang Köhler's laboratory at the University of
Berlin before coming up to Cambridge. I was impressed by Koffka's Principles of Gestalt 
Psychology, which came out in 1935, and was interested in an idea he put forward in the 
book that psychophysical thresholds are influenced by visual configuration. Koffka 
suggested that the absolute threshold of a patch of light would be higher if it were 
projected within the contour of a closed line figure than if it were projected outside the 
contour or on a plain field. In the Gestalt view, this was because the cohesion of a figure 
segregated from its background exerts an inhibitory effect on the detection of irrelevant 
stimuli. I did the experiment and did indeed obtain the predicted result. Kenneth was 
interested and suggested a simple experiment whereby we might control more adequately 
for the effects of contrast. This we did together and obtained absolutely no difference in 
threshold attributable in any way to the presence of a closed figure: such increases in 
threshold as we ascertained could evidently be ascribed without remainder to the effects of 
contrast. Kenneth was clearly convinced (and I soon came to agree with him) that the
differentiation of figure from background takes place at a level higher than that of contrast 
and that brightness thresholds are in consequence unaffected by Gestalt factors. We 
reported our results in a joint paper which Köhler, who had earlier shown some interest in 
our work, not altogether surprisingly refused to accept. Although this brusque rejection of 
our paper left us somewhat disenchanted with Gestalt theory, I never rejected it as 
wholeheartedly as did Kenneth. Indeed looking back on his criticisms of Gestalt theory 
with the advantage of hindsight, I must admit that Kenneth was perhaps a bit unfair to the 
Gestalt psychologists. After all, his own hypothesis of brain models paralleling external 
events which he developed a year or two later does have features in common with Köhler's
idea of isomorphism. For Köhler, like Craik, postulated physical patterns of nervous
activity which in their way were supposed to parallel external reality — or at all events 
external reality as it is represented in our phenomenal experience — even though his theory 
of psycho-neural isomorphism was developed in a very different way from Craik's theory of 
thought. At the same time, there can be no doubt that Kenneth's work on vision showed a 
far keener grasp of physiological principles than did that of the Gestalt school and his ideas 
have proved altogether more enduring.

When the war came, Kenneth was inevitably much in demand to give advice on the 
solution of a large number of practical problems relating to vision, many of them concerned 
with dark adaptation. In connection with night gunnery, for example, he was led to devise 
a photometer of marvellous simplicity which I believe proved of considerable value in 
connection with night flying and tank warfare in the desert. He also did some brilliant
work for the Flying Personnel Research Committee on the visual location of enemy 
submarines during day sweeps by Coastal Command. He at once appreciated that the
problem was one of glare and evolved a simple experimental law relating intensity of glare
to angle of light source on the one hand and size of target on the other. This led him to
suggest an improved method of scanning which worked well in practice and was 
immediately adopted. Kenneth was always quick to grasp the essentials of a practical 
problem and often succeeded in solving it as a result of his own impromptu experiments. 
Indeed, of the 60 or so unpublished reports and memoranda he wrote during the war, well 
over a half deal with visual problems. 

Needless to say, Kenneth was concerned with a wide variety of other applied problems, 
some of them involving vision only indirectly. Many of these were essentially tracking 
problems, which I shall deal with more fully later on. While the study of tracking gave rise 
to the great fascination which machines embodying automatic control principles exerted 
upon him, I do not think it ever wholly supplanted his first love, vision. The last time I had
a chance to talk to Kenneth, less than a month before his death, he told me that after the 
war he proposed to go back to work on visual perception, and instanced in particular the 
visual constancies. He went on to say that recent advances in visual physiology, along with 
the new ideas that had emerged in connection with servo-mechanisms, gave distinct hope 
that one might now take on these problems of perception with a reasonable chance of 
success. If only for this reason, it seems to me particularly appropriate that our newest 
citadel of vision research in Cambridge, in which neurophysiologists and experimental 
psychologists pursue their respective preoccupations in close relationship, should have been 
christened the Kenneth Craik Laboratory.


The nature of explanation 


As I have said, when Craik first came to Cambridge he was still in some doubt as to his 
research project. Even at this early stage, it seems clear that he had the idea of a theory of 
mind and its relationship to mechanism at the back of his mind and this idea continued to 
haunt him for a number of years. In 1943 he published a short book, The Nature of 
Explanation, the only study of any length that he lived to complete. It is evident that, in
spite of his manifest commitment to experimental psychology, his earlier philosophical 
preoccupations were still very much with him, even though in his attitude to philosophy he 
had become distinctly disillusioned. Indeed, the book begins with an almost despairing 
lament that traditional method in philosophy has signally failed to produce any generally 
acceptable explanation of mind and its place in nature. He views the present state of 
philosophy as profoundly discouraging and sees the possibility of progress only if ‘the 
current emphasis on verbal exactitude in definition’ is replaced by the ‘self-validating 
procedures of experiment and hypothesis’. In other words, if philosophy is to solve its 
perennial problems, it must refashion itself on the model of the natural sciences. 

Considerations of this kind lead Craik to examine the nature and function of 
explanation, both in science and in ordinary discourse. In successive chapters he considers 
a priorism and scepticism, causality and probability theory, and the philosophy of modern 
physics. Although the tone is for the most part critical, Craik’s lucid exposition bears all 
the marks of a widely read and well-trained Edinburgh philosopher. But as one reads on 
one discovers that the intention of the book is by no means wholly destructive. The early 
chapters are intended as no more than a prolegomenon to Craik’s own hypothesis of the 
nature of thought, a hypothesis which he evidently supposed — I suspect falsely — to be 
open to experimental verification. This he puts forward in what is quite certainly the most 
original and important chapter in the book. Some of its implications and consequences are 
then discussed in a most interesting and stimulating way. 

What, then, is Craik’s hypothesis? It takes its departure from the incontestably predictive 
character of thinking. Reduced to its simplest terms, a man observes some external event or 
process and arrives at some ‘conclusion’ or ‘prediction’, expressed in words or numbers, 
which will come to pass if his reasoning has been correct. This process of reasoning
involves, first, the translation of external events into words, numbers or other symbols; 
secondly, arrival at other symbols by a process of reasoning, deduction or inference; and 
thirdly, re-translation of these symbols into external processes, i.e. appropriate action, or at 
any rate recognizing the correspondence between the symbols and the outcome of such 
actions, such that the prediction has been fulfilled. Craik illustrates his argument by the 
example of building a bridge: we may proceed by trial and error, which is costly, 
time-consuming and liable to have disastrous consequences, as when the bridge we have 
built cannot bear the weight of the first train to pass over it. If, on the other hand, we first 
think out a satisfactory design and only then proceed to implement it, Craik contends that 
we are proceeding in a manner that closely parallels that of the machines which we might 
use to aid our thought. For example, the strains on a bridge can be represented on a 
machine by movements of toothed wheels representing numbers and by its mechanical 
operations. With these symbols, the machine can predict whether a bridge of a certain 
construction will stand or collapse. It is Craik's contention that a calculating machine, an 
anti-aircraft predictor and Kelvin's tidal predictor all possess essentially the same predictive 
capacity as man because they operate in essentially the same manner as man himself. 

This, then, is the essence of Craik's hypothesis on the nature of thought. His model is 
envisaged as a physical or chemical system which has a relation-structure similar to that of 
the process it imitates. By ‘relation-structure’ Craik means not some esoteric non-physical 
entity but an actual physical working model operating in the same way as the process it 
parallels. As he rightly points out, many of the greatest advances in modern technology 
have been instruments which extend the scope of our sense organs, our brains or our limbs. 
Is it not possible, then, that the brain makes use of mechanisms similar in principle and
that these mechanisms can parallel phenomena in the external world just as a calculating 
machine can parallel the development of strains in a bridge? Our thought, on this view, has 
objective validity precisely because it is not fundamentally different from external reality 
and is specially suited for imitating it.

As the late Lord Adrian put it, Craik seems to be saying that the organism carries in his 
head a small-scale model of the external world and of his own possible actions in regard to 
it. But this model, he adds, is evidently conceived in terms of physical (or better,
neurophysiological) mechanisms and there is no reason why it should involve mental
events, such as images or conscious thoughts. Although rightly distrustful of introspection, 
Craik was certainly no behaviourist and had no wish to exclude mental events from 
scientific consideration. He was disposed to associate consciousness with the activity of 
some, though by no means of all, higher-order neural activities. For this reason, he
preferred to speak of his hypothesis as hylozoistic rather than mechanistic, in so far as it 
accepts the reality of conscious processes, even though these are conceived to be wholly 
dependent on matter, i.e. on the activities of the brain. 

As I have said earlier, Craik insisted that no hard and fast line can be drawn between the 
adaptive reactions of the nervous system which do not involve consciousness and those that 
do. In this connection, he contrasts the effortless use of cues in perception, called by 
Helmholtz unconscious inference, and laborious conscious interpretation and the search for 
meaning. Further, as Hughlings Jackson realized, there is no hard and fast distinction 
between automatic and volitional motor activities: it is but a difference of degree. But
unfortunately, Craik could offer no explanation of the nature and role of consciousness 
beyond the rather unconvincing suggestion that certain key features in nervous integration 
may have a physiological ‘definiteness’ with which is correlated a psychological ‘definiteness’.
This 'definiteness' in some way causes images to be generated in consciousness. Even 
though no one has as yet given a really satisfactory explanation of the origin and function 
of mental imagery, Craik's suggestion seems too vague to be of any real hel
can say with any confidence is that at all events visual imagery appears to de 
mechanisms located in the posterior portions of the brain, more particularly 
right cerebral hemisphere.” 

In trying to assess Craik's theory, one must bear in mind that modern con 
technology was virtually in its infancy at the time Kenneth was writing. He \ 
the era of analogue computing and his model leans heavily on the principles 
analogue devices. These are devices which embody functions between their ir 
outputs to produce the requisite solution without going through the steps of 
or logic necessary for deriving answers by built-in mathematical or logical pr 
Professor Richard Gregory has pointed out to me, it was not clear at the tir 
writing that calculating machines, as he always called them, could do anythir 
Had he lived to read Turing's famous paper on ‘Computing machines and ir 
published 5 years after Kenneth's death, he might well have put forward a m 
brain more in keeping with the properties of digital computers to mimic any 
machine. Moreover, Turing did suggest an experiment to test his hypothesis 
‘think’ which might appear, at all events on the surface, more to the point t]
proposed by Craik.

Richard Gregory has also pointed out to me that, according to Craik's the 
and indeed all symbols are conceived as in some sense ‘in the world’. Had hi 
appreciate the power of digital computers, Gregory suggests, he would have 
their discrete states are very different from what they represent or carry out. 
processing’, he continues, ‘emphasizes the importance of generating procedu: 
certainly not part of the world for they depend on the way the problem is st: 
background concepts and on the limitations of the computer.’ One further :
Gregory: he insists, no doubt with good reason, that if the brain works on tl
digital device, knowledge of the logical procedures being carried out is more
than detailed knowledge of its anatomy and physiology. This may be true th
whether Kenneth would have altogether agreed with him.

I have the impression that had Kenneth survived, he would have come to
Nature of Explanation as an immature work and might well have tried to dis. 
students — I assume he would have become a Professor — from taking much o 
in it too seriously. Furthermore, I seriously wonder whether he would have p 
anything further that might have been regarded as a contribution to philosop 
to science. Whether philosophers have paid any great attention to this little t
know, though one or two have certainly expressed views about models close 
Craik. For example, Popper remarks in a recently published discussion with 
‘one needs models together with animating laws which indicate how the mod
Taken together, these make up a theory and give an explanation and, almost
copy of the natural process.’ Craik's ideas have of course had considerable ir
experimental psychologists, and on some physiologists too. The late Lord Ac 
particular, referred in some detail to Craik's ideas in the last of his Waynflet 
delivered at Magdalen College, Oxford, in the Hilary Term of 1946. He com: 
Craik's work particularly in so far as his approach was akin to that of the pl 
intent upon the mechanical possibilities of nerve-cell combinations. It is a sh 
Kenneth did not live to hear or read Lord Adrian's appreciative tribute. 


The mechanism of human action 


As I have said, Craik's war work, apart from producing solutions to a large 
operational problems of very real practical value, fed continuously into his b 
about the nervous system and human behaviour. Indeed some of his earliest theoretical
formulations of man/machine interface adaptation and the concept of the human operator 
as an element in a control system, were brought together by his former research assistant, 
Miss Margaret Vince, as two posthumous papers which appeared in the late 1940s.²⁵ Miss
Vince also published two papers in 1948 in which she describes several of the experiments 
on the manual tracking of visual targets which she had carried out during the war under 
Kenneth’s direction.²⁶ This work not only led Craik to the notions of central intermittency
and the psychological refractory period, which he outlined in a paper read to The British 
Psychological Society only a few weeks before his death, but also laid the foundations of 
much important work on skill and human performance carried out at Cambridge and 
elsewhere after the war.²⁷ Two of his war-time reports on psychological and physiological
aspects of control mechanisms, with special reference to tank gunnery, were reprinted in 
1963 by Alan Welford in the journal Ergonomics. Although there is evidence that 
Kenneth stood a little in awe of the mathematical superstructure which Tustin and other 
control engineers erected on the foundation of his experiments, there is no doubt that he 
came to regard tracking, whether manual or visual, as a most important way in which to 
approach basic issues of learning and human performance. 

Thanks to the initiative of the late Warren McCulloch, and the assiduity of Stephen 
Sherwood in sifting through and editing many of Craik's unpublished notes and papers, a 
volume containing much important material was published by the Cambridge University 
Press in 1966.²⁹ This includes several chapters — of which alas only two are complete — of a
book which Kenneth seems to have begun towards the end of 1943 and variously entitled 
The Mechanism of Learning and The Mechanism of Human Action. He was evidently 
intending to write a major treatise on the nervous system envisaged as the instrument of 
adaptive behaviour in animals and man. As the editor of this volume points out, this 
unfinished work contained some of the earliest known references to the relationships 
between learning, cyclical events in the nervous system and servo-mechanisms.³⁰

In his introductory chapter, Craik outlines the scope of his projected book. While it may 
be regarded as in a sense a sequel to The Nature of Explanation, it has very little explicit 
reference to philosophy. Craik has come of age as a scientist and we find few references 
even to his own earlier model of the Nature of Thought. He begins by proposing a 
two-pronged attack on the problem of learning and the acquisition of skill. The first he 
calls analytic and it proceeds by way of study of the actual processes, physiological and 
psychological, involved in animal and human behaviour. The second he calls synthetic, and 
it proceeds by way of theoretical investigations of the principles which an organism would 
need to exemplify to show learning. It involves, too, the construction of mechanical devices 
to indicate the possibilities and shortcomings of the various structures and mechanisms 
which may be postulated. Although both approaches have their advantages and 
disadvantages, Craik insists that if our approach is synthetic, we must none the less keep as 
close as we possibly can to the actual structure and function of the organism whose
behaviour we are trying to explain. But he considers it permissible to borrow freely from 
the theory and technological practice which has grown up in the attempt to simulate 
human behaviour and to solve what had previously been regarded as problems capable 
only of human solution. 

Craik then considers the nature of purposive behaviour, as seen by psychologists such as 
McDougall. While he clearly believes in the reality of conscious states and purposive 
action, he asserts that we need postulate no fundamental difference between the behaviour 
of organisms and that of certain classes of man-made machine. The principle of feedback, 
or cyclical action, embodied in servo-mechanisms may well, he contends, arise in the course 
of nature in sufficiently complex systems. As is well known, exactly the same point was
made in 1943, just about the time Kenneth was writing, in an article by Rosenblueth,
Wiener & Bigelow called ‘Behaviour, purpose and teleology’.³¹ In this article, they
adumbrated the thesis that an organism can be treated as a machine if, and only if, our 
concept of the machine is that of a servo-mechanism embodying automatic control 
principles. I do not think for a moment that Kenneth was aware of this article. Its 
argument was of course later developed in extenso by Wiener in his famous book 
Cybernetics, which appeared in 1947, 2 years after Craik's death.³² For purposes of
comparison, I would like to give you Kenneth's statement of his programme in his own 
words as outlined in the introductory chapter: 


On the synthetic approach, we shall go in somewhat greater detail into the theory of 
self-regulating and servo-mechanisms, with their sensory, computing, control-valve and power 
units; their positive and negative feedback with and without time constants; the effect of time 
lags and different control functions on their stability; and the available methods of obtaining 
qualitative variations in response, both as regards spatial and temporal patterns; and the 
possibilities of imitating mechanically the 'grouping' and 'generalising' powers of animals and
men. 

Similarly, in the analytic approach we shall take examples from existing knowledge of the 
structure and function of sense-organs in man and animals, of the transmission of impulses in 
nerve fibres and synapses, and the control of the muscular responses. Sometimes we may take 
examples from the simpler internal self-regulating systems of the body (such as temperature 
regulation and the regulation of breathing) where, again, the behaviour of the system is 
determined by its requirements, and purposiveness is apparent at an unconscious level.³³


I think this quotation brings out well the magnitude of the task Kenneth had set himself. 
His next two chapters, of which only the first is complete, are concerned with general 
principles of automatic regulation and their application to biological systems. These are 
now so familiar to biologists that it is unnecessary to consider them further today. The 
fourth chapter, however, concerning levels of functional organization in the nervous system, 
is more arresting. It carries further the hierarchical principle on which Kenneth placed so 
much emphasis in his earlier writings. I find this chapter of particular interest in so far as I, 
too, found this principle of great value in trying to interpret the psychological sequelae of 
brain injury I was engaged in studying at the Brain Injuries Unit in Edinburgh during the 
war. I may add, in parenthesis, that Kenneth was distinctly interested in our work and 
usually contrived to spend some time at our Unit on his occasional flying visits to 
Edinburgh to see his family. On one occasion we showed him a case of jargon aphasia and 
he was much perplexed by the patient's evident inability to notice the errors in his own
speech. Indeed he refers to this in an informal paper included in The Nature of 
Psychology. Kenneth had read Hughlings Jackson and Henry Head and certainly believed 
that study of the effects of injury or disease of the brain might well come to throw 
important light on its normal functions. Indeed, I think he saw in the idea of levels of 
function a key concept in trying to build a general theory of the nervous system envisaged 
as the instrument of behaviour. Although this idea has become somewhat unfashionable in 
recent years, I was much encouraged to see it resurrected by Donald Broadbent in his Sir
Frederic Bartlett lecture published in 1977. I think Kenneth, too, would have approved.³⁵

Although there is much of interest in this unfinished work, it is in my view particularly 
valuable as containing the only statement known to me of Craik's views on the nature of 
learning. It is clear that Kenneth felt — and I suspect always had felt — deeply challenged by 
the problems of learning and memory, whether approached from the analytic or synthetic 
point of view. Two things stand out here: First, how preoccupied he was with the role of 
pleasure and pain in relation to learning and how quick he was to see the link between 
Thorndike's Law of Effect and feedback principles. Secondly, how fruitful in his view was 
the study of tracking, which had so preoccupied him during the war, as a basic method in
the analysis of human skill. As a form of sensory-motor coordination, he writes, tracking is
capable of all degrees of complexity from something akin in its simplicity to a conditioned 
reflex to a high-level skill, in which anticipation, prediction and intellectual grasp of the 
problem may all play a part. As is well known, this aspect of Craik’s thinking has had very 
great influence on the development of experimental psychology in Cambridge since the war. 
Had he himself survived, however, I think our applied psychology would have been 
informed by a deeper understanding of the fundamental principles governing human 
behaviour. 

Important as I regard what Craik called the synthetic approach and the various types of 
explanatory model to which it has given rise, I think he would have deplored the 
comparative lack of attention paid by many psychologists today to the analytical approach, 
i.e. to issues of nervous structure and function. The gap between physiology and 
psychology seems, if anything, to have grown wider since Craik's death and even 
behaviourists nowadays pay scant attention to the nervous system. Yet, for Craik, the 
analytical approach was equally essential to an understanding of human behaviour. In his 
discussion of tracking, for instance, he remarks that ‘If the study of these behavioural 
phenomena is linked on the one hand with electro-physiological results and work on 
animals with parts of the brain removed, and on the other hand with the advances of 
sensory and motor machines designed to replace man in such tasks as aiming guns or 
steering ships or aircraft — all of them basically tracking problems — we may begin to see the 
most fundamental similarities and differences between man and man-made machines.’³⁶
Had Kenneth survived to become Head of the Cambridge Psychological Laboratory, I feel 
sure that it is in some such way as this that he would have envisaged the future. 


The man and his work 


I would like now to turn back from the work to the man. What kind of man was Kenneth 
Craik? What were his strengths and his limitations? What might have been expected of him 
had he survived? I am only too well aware how subjective any attempt to answer such 
questions is apt to be and must disclaim entirely any special qualifications for doing so. 
However, while preparing this lecture I took the opportunity of consulting some 20 people 
who had known Kenneth pretty well as a friend or colleague, often as both, and wrote to 
some others who were not available to talk to me. Everyone I approached gladly agreed to 
cooperate. Such answers as I shall attempt, therefore, are not solely the outcome of my 
own recollections and personal judgement. I have attempted in some measure to arrive at a 
consensus of the views of many of those who knew him well. 

It is hardly necessary to say that no one whom I consulted had the slightest doubt as to 
Kenneth's intellectual brilliance. Just as his Professors considered him to be the best 
student who had ever come their way, so did his colleagues regard him as a man of 
altogether outstanding intellectual distinction. One of his Cambridge contemporaries 
spontaneously remarked, some 30 years after his death, that Kenneth was the greatest man 
he had ever met. I may say that my informant was a man of highly critical turn of mind 
and little given to superlatives. Another psychological contemporary wrote to me that he 
now regards Kenneth as so far ahead of his time that it took many years for most others in 
his field to catch up with him. In this respect, he writes, he can be compared with his fellow 
Johnian, Professor Dirac. Had Kenneth lived, it seems likely that he would have been 
elected a Fellow of the Royal Society at a singularly early age. 

As is well known, Kenneth's theoretical acumen was matched — some might even say 
exceeded — by his remarkable ingenuity and skill as a designer and maker of scientific 
equipment. Adrian has written of Craik's delight in mechanical contrivances of all kinds 
and his great skill as a model maker. Bartlett has related that the very first time he met
Kenneth ‘out from his waistcoat pocket came his famous working model of an internal
combustion engine’. He adds that Craik’s capacity for designing mechanical instruments 
was so outstanding that he might find it hard on occasion to refuse any problem which 
gave him a chance to invent some new piece of apparatus. Among my informants one, a 
fellow Johnian, refers to Kenneth’s engineering flair as in his experience very rare in a 
biological department. Another considers that, as an instrument maker, Kenneth’s skill was 
uncanny. ‘His work’, he writes, ‘was brilliantly designed and very, very well executed.’ 

It is somewhat paradoxical that while Kenneth showed such remarkable skill in doing 
such exceedingly difficult things with his hands, he none the less gave an impression of 
marked clumsiness in the grosser activities of everyday life. Bartlett, for example, has 
contrasted the beauty of his craftsmanship with the possession of a body that ‘was not 
noticeably very biddable’. This was put rather more bluntly by a former colleague who said 
that ‘as a cyclist he was a menace on the roads’. It is conceivable that Kenneth’s mild 
disability due to a congenital hip dislocation contributed to this clumsiness and thereby 
indirectly to his fatal accident. 

But what of Kenneth’s personality? Those of my informants who knew him well 
repeatedly refer to his friendliness, his helpfulness — especially in matters of equipment 
design — and his enjoyment of company and friendly argument. Bartlett recalls him at 
meetings of the Cambridge Psychological Society at which ‘...he would come in, often a 
bit late, sit cross-legged on the floor and then start some lively discussion, sticking to his 
point with persistence and good humour, and with his wonderful enjoyment of his own 
jokes, some of which were very good’.³⁷ A distinguished mathematician, once a fellow
graduate student with Kenneth at St John's, remembers him as a wonderful person to talk 
to in spite of his virtual lack of mathematical ability. To these tributes I would only add 
that there was an aesthetic side to his nature shown in his love of poetry and music: indeed 
he played the violin and himself wrote poetry, though I do not think any of it was ever 
published. As Bartlett has written, most of the strongest influences in Kenneth's life came 
from people with profound artistic and humane interests.³⁸ These were never wholly eroded
by the scientific preoccupations of his maturity. 

On the surface then, we have the picture of a contented, very active and moderately 
sociable man, much preoccupied with his work but not without his hobbies and forms of 
recreation, which included sailing and canoeing. Sometimes these virtually coincided, as in 
his love of gadgetry and model making. But those of my informants who knew him really 
well are convinced that there was a less happy side to his nature. Indeed this finds some 
confirmation from scrutiny of the few very personal papers which Sherwood and his 
colleagues thought fit to include in their selection. In so far as it may have some bearing on 
his work, I find it my duty to make some brief reference to these more sombre aspects of 
his personality. 

First of all, it seems clear that Kenneth was at bottom a very lonely man. His sometimes 
agonizing loneliness is well described in the following lines: 


Emotionally, I have always felt life to be a struggle to break down a wall that enclosed me on 
every side; it is not so much in one direction, as in any, that it requires to be broken. It is a wall
composed of one's ignorance of other people's thoughts and feelings, and the separation of space 
and time, but most of all, perhaps, composed of one's own weakness of insight, insensitivity and 
poverty of expression.³⁹


Although these lines may well have been written in a passing phase of depression, most of 
those to whom I have been able to talk agree that Kenneth appeared to have had no really 
close friends of either sex with whom he could discuss his deeper problems. For deeper 
problems there undoubtedly were. Bartlett considered that his outstanding capacity for 
designing and making equipment of all kinds derived in the main from certain intellectual
conflicts, never wholly settled, some of them with emotional roots deep in his earlier years.

Secondly, no one to whom I spoke about Kenneth has failed to mention his liking for 
uncomfortable things, such as performing experiments, with himself as subject, which 
caused him to run risks or to experience discomfort or pain. As one war-time colleague put 
it, he was always dreaming up new ways to mortify himself a little. For example, on one 
occasion he consumed a very large quantity of cider mixed with absolute alcohol in order 
to establish the effects, if any, of alcohol on visual acuity. Whether he found such effects I
do not know, but he certainly made himself horribly ill! On another occasion, this more 
serious, he gave himself a central scotoma by looking with one eye at the noonday summer 
sun for a period of 2 minutes. The resulting effects, immediate and more long-lasting, he 
has described with scrupulous care in two short scientific papers, unpublished at the time, 
but which have since appeared in print.⁴⁰ I have been told that Sir Stuart Duke-Elder, the
eminent ophthalmologist, whom Kenneth greatly respected, warned him in no uncertain 
terms of the dangers of carrying out such experiments. Whether he heeded the warning I do 
not know but think it improbable. 

This same tendency appears also in Kenneth's apparent delight in finding himself
unexpectedly in positions of danger. Bartlett has given a graphic description of one such 
instance in which, in company with Bartlett himself, he narrowly escaped being seriously 
hurt in a car accident. After it was all over, he was grinning happily. ‘This’, Bartlett adds, 
‘was just what he enjoyed.’ I can add another example: when on a scientific mission,
together with a colleague, their plane and another flew over a test course. The plane in 
which he and his colleague were flying landed safely but the other plane crash-landed on 
the shores of the Bristol Channel, fortunately without loss of life. Later that night, Kenneth 
remarked to his colleague: ‘We were in the wrong plane today you know!’ He obviously 
experienced genuine chagrin that it was not he who had only just made it to safety. 

Some of my informants have used the term masochism to denote this peculiarity of 
Kenneth's. But I do not believe that this is the appropriate term. So far as I am aware, it 
was not linked to any form of gratification of a sexual kind. It was much more an outcome 
of Kenneth's wish to test himself out, to prove to himself that he could dispense with 
comfort or overcome fear. To exhibit physical prowess, oddly, appeared to mean much 
more to him than intellectual distinction, which he always seemed to take somewhat for 
granted. Some may see in this an Adlerian compensation for his physical handicap, which 
might also be made responsible for his constant self-driving and near-compulsive dedication 
to his work. As one of his oldest Cambridge friends put it: 'Kenneth was being pursued by 
some Calvinistic demon.’ 

What, then, were Kenneth Craik's deeper problems? In my view, they were in part 
emotional and in part intellectual in origin. As regards the emotional aspect, Craik 
undoubtedly had problems of sexual relationship and identity of a kind which are by no 
means uncommon, especially perhaps among people who have had exceptionally close and 
influential relationships with their mothers. Craik was just such a person. Everyone to 
whom I have spoken who knew him at all well has testified to the strength of his
relationship with his mother, to whom he wrote regularly once a week during the whole of 
his life in Cambridge. Yet at the same time, he often seemed anxious to liberate himself 
from this bond and indeed confessed that he got on much better with his mother ‘on 
paper' than face to face. One need not be an avowed Freudian to appreciate that his 
relationship with his parents, though in many respects happy and rewarding, was at a 
deeper level exceedingly complex.⁴¹

What, then, of the intellectual conflicts to which Bartlett has alluded? These I suspect 
arose from a clash between Kenneth's empirical interests, under which I subsume his 
delight in things and their making, and his philosophical interests, under which I subsume
his passion for scientific explanation, more especially of man and his nature. These patterns
of interest are not of course necessarily in conflict. As R. C. Oldfield once pointed out, 
possibly with Kenneth in mind, a tendency that determines in part a man's interest in 
philosophy may also determine in part his interest in model engineering.⁴² None the less, I
do think Kenneth possessed two somewhat opposed patterns of interest: one systematic, 
which led him to philosophy; the other empirical, which brought him to science. As with 
many before him, this conflict found an effective resolution in experimental psychology. It 
did so because psychology brought Kenneth's empirical and practical talents within the 
legitimate sphere of his theoretical preoccupations. He was too much of a scientist to be a 
wholly successful philosopher and too much of a philosopher to be a straightforward 
physicist or engineer — possibly even a physiologist. But as an experimental psychologist he 
was superb and, had he lived, might quite conceivably have become one of the best in the 
world. 

I am confident, too, that Kenneth would have proved loyal to his quest for basic 
principles in psychological explanation. While no doubt he would have continued to make 
his services, and those of the MRC Unit of which he was so briefly Director, available to 
help in the solution of many practical problems of immediate, possibly vital, concern to the 
welfare of Britain in the post-war world, I am none the less convinced that his first priority 
would always have been the advancement of scientific knowledge. In these days when 
teaching, administration and applied research loom so large on the academic scene, it is 
worth bearing in mind that Kenneth Craik was the kind of man who, in his life and work, 
embodied what John Henry Newman so aptly called the Idea of the University.

The Craik and Marshall building at Cambridge,
named after K. J. W. Craik and F. H. A. 
Marshall, is administered by the Departments of 
Physiology and Experimental Psychology. The 
section devoted to vision research is known as the 
Kenneth Craik laboratory. 

I owe this information to James Drever, Jr, until 
recently Vice-Chancellor of the University of 
Dundee, who was an undergraduate contemporary 
of Kenneth Craik’s at the University of 
Edinburgh. 

For Sir Frederic Bartlett on Craik, see his
‘Obituary Notice: Kenneth J. W. Craik’ in the
British Journal of Psychology, 1946, 36, 109-116.
This is reprinted in K. J. W. Craik (1966), The
Nature of Psychology, edited by Stephen L.
Sherwood, pp. xiii-xxi (Cambridge University
Press, Cambridge). All subsequent references to
Bartlett refer to this source.

It was widely rumoured that strong pressure was 
put on Craik by Kemp Smith to consider himself a 
candidate for the Professorship of Psychology at 
Edinburgh on the retirement of Professor James 
Drever, Sr, in 1944. Craik himself told me at the
time that he had no intention whatsoever of
leaving Cambridge so long as Professor Bartlett
remained as Head of the Department of 
Experimental Psychology. 

Bartlett (1946), p. 109 (see Note 3). 

Craik's PhD thesis (1940) and his Fellowship 
Dissertation (1941) embody substantially the same 
material. Access to the former may be had at the
Cambridge University Library and to the latter by
application to the Honorary Librarian,
Department of Experimental Psychology, 
Cambridge. 

See K. J. W. Craik (1944), ‘Medical Research 
Council Unit for Applied Psychology’, Nature 
(London), 154, 476. 

Seventy-eight discrete items are listed in the 
bibliography prepared by S. J. Macpherson and 
appended to Bartlett’s Obituary Notice. With two 
additions, this bibliography is reprinted in K. J. W. 
Craik (1966), The Nature of Psychology, pp.
179-181 (see Note 3). 

This is stated by Warren S. McCulloch, Leo
Verbeck and Stephen Sherwood in Craik, op. cit.,
p. x. Unfortunately, they do not give the precise 
sources of their information. 

Bartlett (1946), p. 110 (see Note 3). 

Full reports of this series of experiments are to be 
found in: ‘The effects of adaptation on differential 
brightness discrimination’, Journal of Physiology, 
1938, 92, 406-421; ‘The effects of adaptation on
visual acuity’, British Journal of Psychology, 1939,
29, 252-266; and ‘The effect of adaptation on
subjective brightness’, Proceedings of the Royal
Society, series B, 1940, 128, 232-247.

K. J. W. Craik, ‘Visual adaptation’, unpublished
Fellowship Dissertation, St John’s College,
Cambridge, 1941, p. 101.

For Craik's work on pressure blindness, see his
paper on ‘Origin of visual after-images’ (1940),
Nature (London), 145, 512; K. J. W. Craik &
M. D. Vernon (1941), ‘The nature of dark
adaptation’, British Journal of Psychology, 32,
62-81.

Craik's views on the relation of adaptation to 
perceptual processes are developed in ch. 7 (pp. 
131-158) of his Fellowship Dissertation (see Note 
12). He directed particular attention to the 
relationship between peripheral and central 
mechanisms in vision and to the wider issues of
levels of function in the central nervous system.
The latter theme is discussed in more general
terms in K. J. W. Craik (1966), pp. 38-53 (see
Note 3).

K. J. W. Craik (1941), ‘The functions and
mechanisms of sensory adaptation’, pp. 159-181, 
chapter 8 of unpublished Fellowship Dissertation 
(see Note 12). 

For Craik's criticisms of Gestalt theory, see his
unpublished Fellowship Dissertation, 1941, ch. 7,
‘Perceptual problems involving adaptation’, pp.
131-158 and K. J. W. Craik (1943), pp. 78-80, 114
(see Note 18). For the Craik-Zangwill experiment,
see ‘Observations relating to the threshold of a
small figure within the contour of a closed
line-figure’, British Journal of Psychology, 1939,
30, 139-150.

The ‘photometer of marvellous simplicity’ features 
in two unpublished reports written in 1943 and 
numbered 44 and 47 in Macpherson's 
bibliography (see Note 8). The original instrument
was presented to the Kenneth Craik Laboratory
by Dr J. A. V. Bates in 1978.

K. J. W. Craik (1943), The Nature of Explanation, 
Cambridge University Press, Cambridge. This
little volume has been reprinted and is now 
available in paperback.

For Adrian on Craik, see E. D. Adrian (1947),
The Physical Background of Perception (The 
Waynflete Lectures, 1946), pp. 82-83, 87-89, 
93-95, The Clarendon Press, Oxford. 

See, for example, O. L. Zangwill (1976), ‘Thought
and the brain’, British Journal of Psychology, 67,
304-314.

For Turing on minds and machines, see his 
‘Computing machines and intelligence’, Mind,
N.S., 1950, 56, 443-460.

R. L. Gregory (1978), Personal Communication. 
For Gregory on brain models, see R. L. Gregory
(1977), ‘Eye and brain machines’, chapter 13 of
Eye and Brain: The Psychology of Seeing, 3rd ed., 
pp. 226-240, Weidenfeld and Nicolson, London. A 
fuller account of this topic is in preparation. 

The reference to Popper is K. Popper & J. C. 
Eccles (1978), The Self and Its Brain, p. 465, 
Springer International, London. For Adrian's
tribute to Craik's work, see E. D. Adrian (1947),
p. 95 (see Note 19). 

K. J. W. Craik (1948), ‘Theory of the human
operator in control systems I: The operator as an
engineering system’, British Journal of Psychology,
38, 56-61. ‘II. Man as an element in a control
system’, ibid., 142-148. Also relevant are J. A. V.
Bates & W. E. Hick (1948), ‘The human operator
of control mechanisms: Physiological and
psychological attributes of a human link in
continuous following and regulating devices’,
published by The Ministry of Supply; W. E. Hick
(1949), ‘Some studies of motor skill and their
implications for a theory of brain mechanisms’,
unpublished thesis submitted for the MD Degree,
University of Durham (a copy of this thesis is
available in the Library of the Department of 
Experimental Psychology at Cambridge). 

M. A. Vince (1948), ‘The intermittency of control
movements and the psychological refractory 
period’, British Journal of Psychology, 38, 
149-157; and ‘Corrective movements in a pursuit 
task’, Quarterly Journal of Experimental 
Psychology, 1, 85-103. 

A good account of this work is given by A. T.
Welford (1968) in his Fundamentals of Skill,
Methuen, London. For Craik’s contribution to the
background ideas, see particularly, pp. 13, 15, 24,
105-106, 114-115, 120, 194-195.

These papers appear as Nos. 49 and 67 in 
Macpherson’s Bibliography (see Note 8). See also
Ergonomics, 1963, 6, 1-33; 419-440. 

K. J. W. Craik (1966) (see Note 3).

K. J. W. Craik, ibid., p. 6.

Philosophy of Science, 1943, 10, 18-24.

Norbert Wiener (1948), Cybernetics, Technology 
Press, Wiley, New York.

K. J. W. Craik (1966), p. 20 (see Note 3). This
extract is cited by kind permission of the
Cambridge University Press. 

K. J. W. Craik, ibid., pp. 159-164.

D. E. Broadbent (1977), ‘Levels, hierarchies and
the locus of control’, Quarterly Journal of
Experimental Psychology, 29, 181-201.

K. J. W. Craik (1966), p. 48 (see Note 3).

Bartlett (1946), p. 110 (see Note 3).

Bartlett, ibid., pp. 113-114.

This extract is taken from a note of Craik's which
appears in The Nature of Psychology, p. 30 (see
Note 3). To it, the editor has given the title ‘Life’.

40 ‘On the effects of looking at the sun’ and
‘Localized Aniseikonia following eclipse blindness’
in K. J. W. Craik (1966), pp. 98-101, 102-103 (see
Note 3).

41 In a private letter to me, Professor H. J. Walton
places emphasis on the combination in Craik’s
behaviour of self-harm, omnipotence and
exhibitionism. He has little doubt that Craik's
relations with his parents are key factors in
arriving at an understanding of his personality.

42 R. C. Oldfield (1939), ‘Some factors in the genesis
of interest in psychology’, British Journal of
Psychology, 30, 109-113.

Cognitive style, set and sorting strategy


Kieron P. O’Connor and G. H. Blowers 





Cognitive style was examined as a predictor of subjects’ preference for response strategy, as opposed 
to attainment of a given response level, in a sorting task. Frame dependence, as scored by Nyborg’s 
analysis of rod and frame test performance, and morphophilic sorting behaviour were compared in 20 
normal subjects, and found to be correlated at the P < 0.001 level of significance. From further
analysis of the sorting task it is argued that differences in attentional ‘set’ selection can account for
different sorting strategies employed. Field-dependent subjects prefer to adopt a stimulus set and
field-independents, a response set, in selective attention.





Cognitive style 


Witkin and co-workers (1954, 1962, 1965, 1977) have proposed that a field-dependent (FD) 
or field-independent (FI) ‘cognitive style’ is a pervasive and stable personality 
characteristic predictive of individual performance differences over a range of cognitive 
areas including social-interpersonal, learning, memory, defence-control, and concept 
attainment behaviour (Goodenough, 1976). 

Cognitive style is most commonly measured by some variant of either the embedded 
figures test (EFT) or the rod and frame test (RFT) which measure specifically perceptual 
modes of viewing parts of a field as discrete from or fused into an organized background. 
Thus the subject who can easily locate a simple figure in a complex design (EFT) or adjust 
a rod to true vertical in the presence of a distracting background (RFT) is said to have 
adopted an FI cognitive style rather than an FD style. Witkin claims that overcoming 
embeddedness is the single factor that accounts for field-dependence-independence (FDI), 
and his basis for generalizing this factor to conceptual as well as perceptual performance is 
the relationship he and his co-workers have subsequently found between measures of FDI 
and other measures involving adaptive flexibility, match problems and analytical subtests of 
intelligence (Witkin et al., 1962). 

However, several studies challenge Witkin’s notion that a general embeddedness factor 
can account for performance on conceptual tasks. Thurstone (1944), Botzum (1951), 
Pemberton (1952) and Karp (1963) have all suggested that tests loading on embeddedness 
also require the subject to overcome distraction. Karp (1963) has shown that ability to 
overcome embeddedness is factorially distinct from ability to resist distraction; thus the 
same figure/ground performance could entail two different abilities. Likewise, Hustmeyer 
(1970) has suggested that a general intelligence factor, as well as a general overcoming 
embeddedness factor, can account for FDI performance. Gordon & Tikofsky (1961) 
and Russo & Vignolo (1967), looking at the internal structure of the EFT, have proposed 
that visuo-spatial ability factors better account for EFT performance than a general 
figure-ground factor. Similarly, Silverman (1968) and Boersma et al. (1969) have claimed 
that different abilities in either scanning control or selective figure-ground articulation can 
equally lead to successful performance in the RFT. 

Witkin et al. (1977) have since argued that differences in cognitive style indicate 
differences in process rather than ability. That is, they predict how a subject will approach a 
task, rather than the specific ability involved in its performance, and so reflect an 
underlying functional level of response organization that cuts across conventionally 
separated areas of ability. This implies that though subjects within the FD or FI group 
may differ in ability to perform a task, they will show consistency in the approach or 
cognitive strategy they employ in performance. At present, strategy differences remain 
unconfirmed since Witkin’s criterion measures of FDI are measures of performance ability 
rather than preference. That is to say, they do not measure choice of approach, but only 
separate those who can from those who cannot articulate item from context. 

If cognitive styles do describe unique pervasive characteristics of personal response 
organization, then FDs and FIs should differ consistently in the response strategies they 
employ in performance of a figure-ground task. 

The present study tests the association between cognitive style and sorting strategy 
chosen in a colour-form sorting task. 


Set and sorting strategy 


Goldstein & Scheerer (1941) originally proposed that sorting by colour or form entails 
mutually exclusive response strategies adopted under a specific type of cognitive style or 
set. Sorting by form or class entails adopting the appropriate set to group items in different 
contexts, and form conceptual or categorical appurtenances, whereas in colour-dominated 
sorting the appropriate set is unconceptual, and receptive only to the particular 
concreteness of each item, and its specific colour attributes. 

Sorting salient items from a background of irrelevant cues involves figure-ground 
articulation, but whereas FDI tasks such as the EFT or RFT give information only on 
perceptual performance, colour-form sorting gives information on appropriate conceptual 
strategies employed in performance. 

Vignolo (1968) has reported a consistent relationship between colour-dominated sorting 
and field dependence amongst different brain-damaged groups. However, here cortical 
restriction may have precluded choice of ‘set’, and it is difficult to separate sensory from 
cognitive influences in selection of strategies. 

An important distinction must be drawn between achieving correct selection of relevant 
and irrelevant items, predefined so by the experimenter, and allowing the subject to define 
figure and ground according to preference. Normal subjects will be able to sort by form or 
colour on request, and their success will depend on various ability, motivational and 
situational factors. However their preference for sorting by one or other criterion is more 
likely to reflect habitual strategies and processes underlying stimulus selection and response 
organization. 

Several studies have shown that normal subjects do show preferences in selection of 
colour- or form-dominated grouping strategies (Goldstein, 1943; Reitan, 1958; Goldstein et 
al., 1968) and that this selection reflects organized cognitive strategies (Fleishman, 1976) 
and not just sensory influences (Brothers & Gaines, 1973). Tien (1960) has devised and 
validated a card-sorting task which scores colour-form preferences along a 
morphophilia-chromaphilia dimension. He has reported that patients with reading 
disability, affective disorders and schizophrenia tend towards chromaphilia (Tien & 
Clarke, 1964; Tien, 1966), and Witkin (1965) has likewise reported that such groups tend 
to be FD. This indirect evidence suggests a link between FD and chromaphilic sorting 
preference. To test this association directly, sorting strategy as measured along Tien’s 
chromaphilia-morphophilia dimension and cognitive style as measured by RFT 
performance were compared in 20 normal subjects. 


Method 


Subjects were 20 volunteers screened for normal visuo-spatial ability from a range of skilled 
occupations: 10 females, 10 males, mean age 29, SD 9.89. 

The rod and frame apparatus was a portable plywood box housing a square frame and rod coated 
in luminous paint. The subject viewed the rod and frame through a viewing area shaped like a diver's 
mask. The distance from the viewing area to the frame was 57 cm, and the distance from one edge of 
the frame to the other subtended a visual angle of 28°. 

Field dependence is operationalized in the RFT in terms of frame dependence, and is measured by 
determining how successful the subject is in adjusting a rod within a tilted frame to a physically 
vertical position. The experimental procedure involves two presentations of four combinations of rod 
and frame tilt, each at 28° clockwise or anticlockwise rotation from true vertical. 

Subjects were classified as field dependent (FD) or field independent (FI), according to Nyborg’s 
method of scoring RFT performance (Nyborg, 1974). Nyborg’s method enables separation of the 
subject’s frame-effect error component (that is, error due specifically to the tilt of the frame) from the 
mean unsigned error which is the traditional RFT measure of field dependence (Witkin, 1962). The 
significance of individual sources of variation can also be estimated by Student’s t distribution when 
the standard deviation or response consistency factor is taken into account. It has been argued 
elsewhere (O’Connor & Shaw, 1978) that use of Nyborg’s criterion of frame dependence rather than 
the usual mean error score leads to less confounding between cognitive and sensory tonic factors in 
performance. 

Subjects were scored as field dependent (or independent) depending on whether (or not) they 
produced a significant (P < 0.05) frame effect, using this method. 

The colour-form sorting task (CFT) used a picture-splitting technique whereby the subject was 
presented with 10 sets of three pictures. All pictures were incomplete. Parts of them were cut off to 
make identification more difficult. The only exception was the control set which had two identical 
pictures. The subject was asked to indicate which two pictures were most alike. One of the pictures 
bore a form likeness, the other a colour likeness to the criterion picture, but likenesses varied in 
degrees over the 10 sets. The CFT score was derived from the sum of different percentage form 
weightings allocated to each set (Tien, 1960). 

The CFT and RFT were administered on separate occasions. No time limit was imposed on either 
task, though the subject was asked to report immediately he had completed the task. 


Results 


Table 1 gives the CFT and RFT test scores for each subject. The frame effect component 
and response consistency factor as computed from Nyborg’s method of analysis are 
tabulated together with the t values on which frame dependence was based. Tien considers 
scores above 60 to be reliably morphophilic. Nine of the 11 FI subjects, and one of the 
eight FD subjects score in the morphophilic range. Fisher’s exact probability test upholds 
an association between FDI and morphophilia-chromaphilia with an exact probability of 
0.0027. At an ordinal level Spearman’s rank-order correlation coefficient gives a 
correlation of -0.69, t = 4.1, P < 0.001 (two-tailed), between degree of frame dependence 
and degree of morphophilia. Subjects thus have the same relative position on both 
tasks, which indicates measurement of a common performance factor. 
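
As a check on the arithmetic, the t value attached to a rank-order correlation can be recovered from the coefficient and the number of subjects by the standard conversion t = r √((n − 2)/(1 − r²)) on n − 2 degrees of freedom. The sketch below is illustrative only and is not the authors' own computation; the function name is hypothetical, and it assumes the conversion was applied to all 20 subjects.

```python
import math

def t_from_rank_correlation(r: float, n: int) -> float:
    """Convert a correlation coefficient r from n paired scores to a t statistic.

    Uses the standard conversion t = r * sqrt((n - 2) / (1 - r**2)),
    referred to n - 2 degrees of freedom.
    """
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Values reported in the Results: a coefficient of -0.69 from 20 subjects.
t_value = t_from_rank_correlation(-0.69, 20)
print(f"|t| = {abs(t_value):.2f} on {20 - 2} d.f.")
```

With the coefficient as rounded in the text this gives a value a little above 4.0, which agrees with the reported 4.1 once rounding of the coefficient is allowed for.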


Discussion 

If a subject is FD then the present results show that there is a high probability that he will 
adopt a more chromaphilic sorting strategy in sorting stimuli than an FI subject. The choice 
between sorting strategies is a choice of cognitive set, and in the present study reflected a 
choice consistent with cognitive style. 

Broadbent (1970) has accounted for different sorting strategies by proposing a distinction 
between adoption of a stimulus set or a response set in selective attention. When relevant 
and irrelevant stimuli are discriminated by an obvious physical characteristic such as size, 
intensity or colour, the subject has adopted a stimulus set to selection by simply filtering 
possible stimulus input. Response set, on the other hand, restricts possible responses. Thus 
a task which requires selection of items by a shared abstract category such as form or class 
rather than a physical attribute requires response set. 

In the present study the FDs chose to sort by colour, and the FIs by form. In 
Broadbent's terms FD subjects adopted a stimulus set, while FI subjects adopted a 
response set, under conditions which distinguished the choice. Thus FI subjects are 
prepared to search for conceptual groupings and relations within a stimulus configuration, 
whilst FD subjects are prepared to group by more overt attributes. Eye-movement 
strategies monitored during RFT performance do show that FD subjects look at overt 
stimulus features longer and more frequently than FI subjects (Blowers & O’Connor, 1978). 
This would support ‘set’ differences between the groups. 

Previous studies have shown that FD subjects sort more efficiently when more noticeable 
stimulus cues are favoured as criteria, and so not all stimuli have equal salience value for 
each subject (Dickstein, 1968). Thus FD and FI subjects may actively favour different cues 
as a criterion bias effect resulting from a set readiness towards different input selection. 

The efficacy of selecting through a stimulus or a response set will depend on the task 
demand, but response set is more likely to hold figural attensity constant amidst physically 
similar simultaneous background input. Thus conventional FDI tasks will require adoption 
of response set for successful performance and so favour the FI. However, a task where 
stimulus relations are held constant and overt target characteristics varied would be better 
performed through adoption of stimulus rather than response set, and so favour the FD 
approach. 


In conclusion, the results indicate that FD and FI subjects do show consistent response 
strategies in performance, as is implied by Witkin’s ‘process’ vs. ‘ability’ model of FDI 
cognitive styles. The results further suggest that these different information-processing styles 
are operationalized in terms of different attentional ‘sets’ and strategies adopted towards 
control of stimulus input. 


Several processing models have been proposed to account for figure-ground performance 
differences. These have centred largely on global vs. analytic (Witkin, 1965), simultaneous 
vs. successive (Das et al., 1975) and right vs. left hemisphere differences (Cohen et al., 1973) 
in control of information processing. Such differences may be valid, but when attempting to 
operationalize these processing systems, authors have been forced into analysing 
performance scores obtained across cognitive tasks, and so dealing with task-specific 
abilities, even if they prefer to label them as processes or modes of integration (Vernon et 
al., 1978). 


It is proposed here that manipulation of the attentional set required within a task will 
give more information on FDI processing differences than further generalization of a 
figure-ground ability factor across more cognitive areas. 


Measurement in psychology 


C. O. Fraser 





Two major traditions regarding the theory and practice of measurement in psychology can be 
identified: the classical approach associated with the work of S. S. Stevens, and the more recently 
developed axiomatic approach. These two approaches are compared and related to the historical 
development of the logical analysis of measurement. 

Some practical consequences of the choice of a measurement theory paradigm are discussed, and 
the appropriateness of the two major approaches to guide the theory and practice of psychological 
measurement is evaluated. 





Terms such as ‘measurement’ and ‘scaling’ have been used fairly loosely in the physical 
and social sciences with little general agreement on precisely what they should be taken to 
mean. The question of the measurement of psychological attributes in particular has been 
surrounded by considerable controversy. While various procedures have been developed for 
the quantification of concepts such as attitudes and sensations, this has been until recently 
in the absence of any logical analysis of their status and justification. The only method of 
evaluating the quality of such measures has been a collection of essentially ad hoc statistical 
techniques subsumed under the titles ‘reliability’ and ‘validity’. Consequently, it is hardly 
surprising that most psychological measurement has been regarded as being quantitatively 
and qualitatively of a lower order than physical measurement. Recent developments have 
shown, however, that while many of the measures obtained in psychology may be of 
substantially lower precision than most physical measurement, the theoretical status of the 
measures can in fact be justified to an equivalent degree. 

The determination of precisely what is or is not to be regarded as measurement, and of 
qualitative distinctions between different types of measurement, can only be derived from a 
logically consistent theory of measurement. There are two distinct approaches or 
metatheories of measurement that include psychological attributes within their domain: 
what is sometimes called the classical approach, due to Stevens (1939, 1946, 1951, 1957), and 
the axiomatic approach (e.g. Suppes & Zinnes, 1963; Coombs, 1964; Krantz, 1968; Krantz 
et al., 1971). 

The aim of this review is to compare the essential nature of each of these two 
approaches, to consider their contribution to the historical development of the logical 
analysis of measurement, and to evaluate them as useful paradigms to govern the theory 
and practice of the measurement of psychological characteristics. 


Historical foundations of measurement theory 


Measurement has always been regarded as a restricted set of operations from the general 
domain of predicative acts like classification, identification, description, comparison, etc. It 
has also always been agreed that measurement acts are restricted to those involving the 
assignment of numbers. Some authors (e.g. Nagel, 1931) have pointed out that this 
restriction is essentially arbitrary, chosen more by convention than necessity. Numbers are 
simply a more precise and unambiguous way of delimiting and fixing our ideas of things, 
and thus it is convenient to reserve a term for numerical evaluations. 

Up until the early 20th century the philosophical approach to measurement was based 
largely on the principle of Platonic idealism. That is, there exists a specific amount of the 
property of an object that is to be measured (denoted its ‘magnitude’), and the goal of 
measurement is to assign a numerical value or quantity that accurately identifies this 
magnitude. The measurability conditions that this imposed on magnitudes were first 
articulated by Hölder (1901) in his classic formulation of the axioms of quantity. These 
specified the conditions that quantities, and therefore magnitudes, had to obey in order to 
be used in mathematical equations. They are listed below with only slight modification. 

(1) Either a > b, or a < b, or a = b. 

(2) If a > b, and b > c, then a > c. 

(3) For every a there is an a′ such that a = a′. 

(4) If a > b, and b = b′, then a > b′. 

(5) If a = b, then b = a. 

(6) For every a there is a b such that a > b (within limits). 

(7) For every a and b there is a c such that c = a + b. 

(8) a + b > a. 

(9) a + b = a′ + b′. 

(10) a + b = b + a. 

(11) (a + b) + c = a + (b + c). 

(12) If a < b there is a number n such that na > b (also within limits). 

These axioms are expressed in terms of the actual quantities of specific elements, with the 
notation a’ referring to an element distinct from ‘a’, but possessing the same quantity. 

The first six axioms involve identification of relations of order or equality among single 
quantities, while the last six require the further condition that one can add quantities to 
make a new quantity. 
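
To make the division between the order axioms and the additive axioms concrete, the following sketch tests a selection of them by brute-force enumeration on a small finite sample. It is an illustration under stated assumptions rather than anything drawn from Hölder or Campbell: the function name is hypothetical, exact rational numbers stand in for magnitudes such as masses, and only those axioms that can be checked by enumeration on a finite sample are included.

```python
from fractions import Fraction
from itertools import product

def check_axioms_of_quantity(quantities):
    """Check some order and additivity axioms on a finite sample of quantities.

    Returns a dict mapping a short axiom label to True or False.
    """
    qs = list(quantities)
    results = {}
    # (1) Exactly one of a > b, a < b, a = b holds for every pair.
    results["1 trichotomy"] = all(
        (a > b) + (a < b) + (a == b) == 1 for a, b in product(qs, repeat=2)
    )
    # (2) If a > b and b > c then a > c.
    results["2 transitivity"] = all(
        a > c for a, b, c in product(qs, repeat=3) if a > b and b > c
    )
    # (5) If a = b then b = a.
    results["5 symmetry of ="] = all(
        b == a for a, b in product(qs, repeat=2) if a == b
    )
    # (8) Adding any quantity to a yields something greater than a.
    results["8 a + b > a"] = all(a + b > a for a, b in product(qs, repeat=2))
    # (10) Addition is commutative.
    results["10 commutativity"] = all(
        a + b == b + a for a, b in product(qs, repeat=2)
    )
    # (11) Addition is associative.
    results["11 associativity"] = all(
        (a + b) + c == a + (b + c) for a, b, c in product(qs, repeat=3)
    )
    return results

# Positive rationals stand in for magnitudes of an extensive property.
masses = [Fraction(1, 2), Fraction(3, 4), Fraction(2), Fraction(5, 3)]
report = check_axioms_of_quantity(masses)

# A sample containing zero or negative values fails axiom (8).
signed = check_axioms_of_quantity([Fraction(-1), Fraction(0), Fraction(2)])

print(report)
print(signed["8 a + b > a"])
```

The second sample is included to show that the additive axioms are not trivially satisfied: numbers that are perfectly well ordered can still fail the requirement that combining two quantities always produces a greater one.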

The first decisive break from this approach was initiated in the scientific theories of Ernst 
Mach which incorporated a reanalysis of the status of quantitative physical concepts such 
as mass and temperature. 

The first complete exposition of the logical foundations of physical measurement was 
provided by N. R. Campbell (1920, 1921, 1928, 1938). Campbell saw measurement as the 
demonstration of isomorphism between the idea of quantity and the magnitudes of the 
property to be measured. The way to do this was to demonstrate that the magnitudes 
obeyed the axioms of quantity developed by Hölder. 

Measurement thus depended on being able to observe relations between physical objects 
as a consequence of performing an empirical operation. This empirical operation was the 
defining operation of the measurement. There was no meaning to the concept of two 
objects being equal with respect to some quality without first defining the operation used to 
test equality. 

The problem with this was that while these axioms were empirically testable for 
properties like mass where putting objects on each side of a balance clearly corresponded 
to the mathematical operations of ordering and adding quantities, many properties did not 
possess such empirical operations. Density, for example, possesses an operation for ordering 
objects (which floats on which) but not for adding them. However, since most physical laws 
required additive quantities this was not regarded as an operation of any value in terms of 
measurement. This empirically demonstrable capacity for addition was regarded as 
representing a critical distinction between properties in terms of their measurability; those 
that had such an operation were denoted extensive properties, while those that did not 
were intensive. 

However, many important properties in physics were intensive and yet were still capable 
of being measured successfully. Measures on these properties were derived as a consequence 
of mathematical laws relating them to extensive properties. For example, density was 
derived as the ratio of mass to volume. Campbell thus distinguished two kinds of 
measurement that could produce quantities that could be used in mathematical equations 
(i.e. what we would now call a ratio scale): direct or fundamental measurement, which 
required the existence of an extensive property, and derived measurement, which required a 
mathematical relationship with one or more extensive properties. A distinction is 
sometimes made between measures derived from a direct association with a single extensive 
property and those resulting from a mathematical relationship with two or more extensive 
properties. Ellis (1960) denotes the former as associative measurement. Associative 
measurement requires the assumption that a relation on the extensive property implies the 
same relation on the property being measured (e.g. relations on the expansion of mercury 
to indicate relations on temperature). The latter, denoted surrogative measurement, derives 
an actual numerical value for the ‘surrogate’ of the property being measured. (This term 
was proposed by Meinong, 1914, p. 219.) In both cases, however, they require a theoretical 
relation with the extensive property to have been established. 

Campbell clearly took the view that relations can only be exhibited among physical 
objects, and hence all measures must be in terms of empirically observable relations among 
actual objects. (Although this implies that scales can only be relative they can of course be 
made absolute by referral to a set of standard objects for which absolute quantities have 
been determined, such as the length of the emperor’s arm or the thickness of his thumb.) 
The actual means by which measures are obtained of course need not be in this manner. 
Many properties which are capable of fundamental measurement are more conveniently 
measured in a derived manner, e.g. weight by the amount of extension of a spring. 

This treatment of measurement theory thus takes a fairly restricted view of what can 
constitute measurement. Only one type of scale was countenanced; ‘quantity’ had one set 
of properties and these were strictly not negotiable. Magnitudes had to meet these in their 
entirety or they could not be measured. This, together with the insistence that relations 
could only be exhibited between physical objects, was criticized by several physical theorists, 
e.g. Russell (1937, pp. 164 ff.). The system also clearly offered no possibility of extension to 
include psychological attributes. Campbell in fact argued that such attributes, being 
intrinsic, were incapable of measurement. He was a member of a committee set up to 
consider whether sensations in general, and the sone scale of loudness in particular, 
constituted measurement. In its Final Report (1940) it considered that 

‘any law purporting to express a quantitative relation between sensation intensity and 
stimulus intensity is not merely false but is in fact misleading unless and until a 
meaning can be given to the concept of addition as applied to sensation’. 

However, even many physical measures are in difficulty in this regard. No external meaning 
can be given to statements such as ‘twice as dense’, ‘twice as hot’, or even the addition of 
two temperatures or densities (at least not in the sense of adding two objects with the 
particular temperatures, etc., involved). This is one of the major points on which Russell 
takes issue with Campbell’s approach. Russell argues that each magnitude is qualitatively 
distinct and thus cannot be added. The addition of two magnitudes yields simply two 
magnitudes, while the addition of two quantities does give a new single whole (1937, p. 
178). 

The problem thus really arises from a lack of the appreciation of the difference between 
the physical magnitudes and the formal system (the quantities) used to describe them. It is 
in fact the explicit recognition of this distinction that constitutes one of the main 
characteristics of the axiomatic approach. However, in order to maintain the historical 
perspective the psychological measurement theories will now be considered in their 
chronological order. 


Classical measurement theory 


Stevens’ main innovation was to remove the restriction that numbers assigned as measures 
had necessarily to obey the laws of quantity, and to introduce the possibility of other types 
of scales. Stevens’ approach was to define measurement broadly as ‘the assignment of 
numbers to objects or events according to rules’ (1951, p. 1) and then to distinguish 
different types of such assignment. The type or level of measurement was classified in terms 
of the type of transformations that leave the scale form invariant (i.e. so that it still obeys 
the rule). 


Stevens’ system thus extended Campbell’s to allow for four scales, nominal, ordinal, 
interval and ratio, to represent four possible classes of rules for assigning numbers. The 
system is summarized in Table 1, which shows for each scale the empirical operations 
needed to create it and the transformations which are permissible for each scale type and 
which serve to define them. 


Table 1. A classification of scales of measurementᵃ 

| Scale | Basic empirical operations | Mathematical group structure | Typical examples |
|---|---|---|---|
| Nominal | Determination of equality | Permutation group: x' = f(x), where f(x) means any one-to-one substitution | ‘Numbering’ of football players; assignment of type or model numbers to classes |
| Ordinal | Determination of greater or less | Isotonic group: x' = f(x), where f(x) means any increasing monotonic function | Hardness of minerals; street numbers; grades of leather, lumber, wool, etc.; intelligence test raw scores |
| Interval | Determination of the equality of intervals or of differences | Linear or affine group: x' = ax + b, a > 0 | Temperature (Fahrenheit or Celsius); position; time (calendar); energy (potential); intelligence test ‘standard scores’ (?) |
| Ratio | Determination of the equality of ratios | Similarity group: x' = cx, c > 0 | Numerosity; length, density, work, time intervals, etc.; temperature (Rankine or Kelvin); loudness (sones); brightness (brils) |

ᵃ Measurement is the assignment of numerals to events or objects according to rule. The rules for 
four kinds of scales are tabulated above. The basic operations needed to create a given scale are all 
those listed in the second column, down to and including the operation listed opposite the scale. The 
third column gives the mathematical transformations that leave the scale form invariant. Any 
numeral x on a scale can be replaced by another numeral x' where x' is the function of x listed in 
column 3 (from Stevens, 1959, p. 24). 
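
The sense in which a group of transformations defines a scale type can be illustrated numerically. The sketch below applies one example of each of the four kinds of transformation to a small set of values and records which properties survive: equality, order, ratios of differences, and ratios of the values themselves. The particular values, the relabelling and the helper names are all hypothetical and chosen only for illustration.

```python
import math

def equality_preserved(xs, f):
    """Equal values stay equal and distinct values stay distinct."""
    ys = [f(x) for x in xs]
    n = len(xs)
    return all((xs[i] == xs[j]) == (ys[i] == ys[j]) for i in range(n) for j in range(n))

def order_preserved(xs, f):
    """The rank order of the values is unchanged."""
    ys = [f(x) for x in xs]
    n = len(xs)
    return all((xs[i] < xs[j]) == (ys[i] < ys[j]) for i in range(n) for j in range(n))

def interval_ratios_preserved(xs, f):
    """Ratios of differences from the first value are unchanged."""
    ys = [f(x) for x in xs]
    n = len(xs)
    return all(
        math.isclose((xs[i] - xs[0]) / (xs[j] - xs[0]),
                     (ys[i] - ys[0]) / (ys[j] - ys[0]))
        for i in range(1, n) for j in range(1, n)
    )

def ratios_preserved(xs, f):
    """Ratios of the values themselves are unchanged."""
    ys = [f(x) for x in xs]
    n = len(xs)
    return all(math.isclose(xs[i] / xs[j], ys[i] / ys[j]) for i in range(n) for j in range(n))

values = [10.0, 20.0, 40.0]                     # hypothetical, distinct and non-zero
relabel = {10.0: 7.0, 20.0: 3.0, 40.0: 5.0}     # an arbitrary one-to-one substitution

transformations = {
    "permutation": relabel.get,                 # x' = f(x), one-to-one
    "isotonic": lambda x: x ** 3,               # x' = f(x), increasing
    "affine": lambda x: 1.8 * x + 32.0,         # x' = ax + b, a > 0
    "similarity": lambda x: 2.54 * x,           # x' = cx, c > 0
}
checks = [equality_preserved, order_preserved, interval_ratios_preserved, ratios_preserved]

invariance = {
    name: [check(values, f) for check in checks]
    for name, f in transformations.items()
}
for name, row in invariance.items():
    print(f"{name:12s}", row)
```

Each row keeps one more property than the row before it, which mirrors the cumulative reading of the second column of the table: the more restricted the group of permissible transformations, the more of the structure of the numbers is treated as meaningful.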


Stevens thus does recognize the distinction between properties of the formal system 
(numbers) and the empirical system inasmuch as he allows for less than the complete range 
of properties of the number system to be used to represent a given empirical system. 
However, he still retains the assumption, as can be seen by Table 1, that the properties of 
the two systems must be isomorphic, that is, an empirical operation can only lead to a 
scale possessing the same relational properties. It can now be shown that this is not in fact 
the case, and that, for example, empirical observations of order can lead to interval type 
scales. It was because of this isomorphism assumption that Stevens did not see the necessity 
to consider properties of empirical observations as independent of the scale type. For him 
one was completely subsumed by the other. In many respects the Stevens-tradition (ST) 
measurement theory does not represent any real improvement over Campbell’s approach, 
particularly in terms of distinguishing the empirical and formal systems. The emphasis is 
simply shifted from one extreme to the other — whereas Campbell regarded empirical 
relations as paramount, Stevens emphasizes scale properties as containing all the important 
information. 

This in fact represents the major limitation of both of these approaches. Stevens, like 
Campbell, treats the relation between the empirical and formal systems as axiomatic, a 
necessary condition for measurement to proceed, and something which must therefore be 
present if measurement has been achieved. Yet clearly identifying the precise nature of the 
relationship between scale values and empirical entities is a prerequisite, not a consequence, 
of choosing a particular scale type. 


Axiomatic approaches to measurement theory 


Probably the major difference that characterizes the axiomatic approach to measurement is 
that it explicitly recognizes the role of theory in measurement. Measurement is seen as 
being an integral part of the theory, rather than simply a problem to be overcome prior to 
theory construction. Even in physics, the measurement of intensive properties, such as 
temperature, only arises as a by-product of theory. And in practice theory is often 
necessary for the measurement of extensive properties, for example, for lengths too long to 
measure with a ruler or masses too great to place on scales. The measurement can only be 
accurate to the extent that the theory is accurate. For example, the measurement of 
distance by applied trigonometry depends on the theory of plane Euclidean geometry, but 
this fails and must give way to spherical geometry when large distances on the earth's 
surface are involved. 
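
The dependence of a measurement on the adequacy of its underlying theory can be shown with the distance example. The sketch below computes the distance between two points on the earth twice, once from plane geometry on a flat map and once from spherical geometry, and compares the two. It is a rough illustration under explicit assumptions: the earth is treated as a perfect sphere with an assumed mean radius of 6371 km, the flat map is a simple equirectangular projection, and the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the earth is treated as a sphere

def plane_distance_km(lat1, lon1, lat2, lon2):
    """Distance by plane Euclidean geometry on a flat (equirectangular) map."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    mean_phi = (phi1 + phi2) / 2.0
    dx = math.radians(lon2 - lon1) * math.cos(mean_phi) * EARTH_RADIUS_KM
    dy = (phi2 - phi1) * EARTH_RADIUS_KM
    return math.hypot(dx, dy)

def sphere_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance by spherical geometry (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def relative_error(lat1, lon1, lat2, lon2):
    """Relative disagreement of the plane result with the spherical result."""
    exact = sphere_distance_km(lat1, lon1, lat2, lon2)
    return abs(plane_distance_km(lat1, lon1, lat2, lon2) - exact) / exact

short_error = relative_error(52.0, 0.0, 52.1, 0.1)    # points a few kilometres apart
long_error = relative_error(52.0, 0.0, 40.0, -74.0)   # points thousands of kilometres apart

print(f"short distance: {short_error:.6%}")
print(f"long distance:  {long_error:.2%}")
```

For the nearby pair the two theories agree to within a negligible fraction, while for the distant pair the plane result is off by several per cent, so the same measuring procedure is accurate or inaccurate according to whether its theory holds at the scale in question.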

Measurement then is seen as the construction of a model of some property of the world. 
Like all modelling it involves the establishment of a correspondence between an empirical 
relational system (the world) and a formal relational system (the model), so that one can be 
said to represent the other. If the model is numerical, that is, if arithmetic is being used as 
the formal system then the representation is called measurement. 

There are two major components in the formal system (arithmetic) for which we must 
establish a correspondence in a given empirical system for it to be measured; the elements 
of the system (the numbers) and the relations that are assumed to hold between them. 
Unlike the early notion of quantity, the arithmetic relations that are assumed to hold may 
be restricted to some formally defined subset of the total range, such as those of the form 


a ≥ b, 
or |a − b| ≥ |c − d|, 
or a/b ≥ c, and so on. 


The requirement on the actual behaviour observations that we experience is simply that 
they be classified and structured as objects and relations. This is in fact the most basic and 
theoretically neutral level on which we can record observations within the empirical 
tradition. We can, therefore, express the theoretical assumptions on which measurability is 
based entirely in terms of the conditions under which this empirical relational system can 
be represented by a formal relational system. 

In formal terms a relational system $\mathcal{A} = \langle A, R \rangle$, where A is a set of objects and R is a
relation defined on those objects, is said to be represented by another relational system
$\mathcal{B} = \langle B, S \rangle$ if there exists a function f from A into B such that for all x, y in A


$xRy \Rightarrow f(x)\,S\,f(y)$.

That is, the relation S holds between f(x) and f(y) if the relation R holds between x and y.
If both $\mathcal{A}$ represents $\mathcal{B}$ and $\mathcal{B}$ represents $\mathcal{A}$, the two systems are said to be isomorphic.
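
The definition of representation can be checked mechanically for a finite set of objects. The sketch below is only an illustration: the objects, the observed "heavier than" pairs, the numerical assignments and the function name `represents` are all hypothetical.

```python
from itertools import product
import operator


def represents(objects, R, f, S):
    """Return True if x R y implies f(x) S f(y) for every pair of objects."""
    return all(S(f[x], f[y])
               for x, y in product(objects, repeat=2)
               if R(x, y))


# Hypothetical empirical system: (x, y) means "x was judged heavier than y".
objects = ["a", "b", "c"]
heavier_pairs = {("a", "b"), ("b", "c"), ("a", "c")}


def heavier(x, y):
    return (x, y) in heavier_pairs


# Two candidate numerical assignments f: A -> real numbers.
good_scale = {"a": 3.0, "b": 2.0, "c": 1.0}
bad_scale = {"a": 1.0, "b": 2.0, "c": 3.0}

# The formal relation S is "greater than" on the numbers.
good_ok = represents(objects, heavier, good_scale, operator.gt)
bad_ok = represents(objects, heavier, bad_scale, operator.gt)
print(good_ok, bad_ok)
```

Only the first assignment reflects every observed relation by the corresponding numerical relation, so only it counts as a representation in the sense defined above.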

The essence of numerical representation is thus the assignment of numbers to objects in 
such a way that the observed relations among objects are reflected by corresponding 
relations among the numbers assigned to them. To determine under what conditions this is 
possible one must refer to the representation theorem for the measurement model. The 
measurement model specifies the type of representation that is being produced by a given 
method of assigning numbers; that is, it specifies what relations defined on the number 
system are assumed to represent relations holding in the empirical system. The 
representation theorem specifies a number of axioms which must be shown to be satisfied 
in order for it to be possible to produce a numerical assignment to a given empirical
system that provides a representation of the type defined by the measurement model.

For example, one of the simplest measurement models is that in which one of the ordering
relations on the number system (i.e. > or <) is assumed to represent a given empirical
relation (e.g. asking a subject to choose which is the heavier of two weights, or which is
his preferred choice of two alternatives).

It can be shown that, provided the number of alternatives is finite, transitivity of the
empirical relation is a sufficient condition for its representation by the relation ‘greater
than’ on real numbers.

Conditions for measurability on a particular scale are thus largely specified in terms of 
assumptions which are easy to subject to empirical test. For example, in this case one needs 
to verify that for every triple of objects x, y, z the empirical relation R satisfies 


$xRy \text{ and } yRz \Rightarrow xRz$.
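
A test of this kind is easy to carry out on a finite set of paired comparisons. The following sketch uses hypothetical preference data and hypothetical function names. It checks every triple for transitivity and, where the check succeeds, builds one possible numerical assignment by counting how many objects each object dominates. This construction additionally assumes that no object is related to itself.

```python
from itertools import product


def is_transitive(objects, R):
    """Check x R y and y R z => x R z for every triple of objects."""
    return all(R(x, z)
               for x, y, z in product(objects, repeat=3)
               if R(x, y) and R(y, z))


def ordinal_scale(objects, R):
    """Assign to each object the number of objects it dominates.

    Assumes R is transitive and irreflexive; then x R y implies f(x) > f(y).
    """
    return {x: sum(1 for y in objects if R(x, y)) for x in objects}


drinks = ["tea", "coffee", "water"]

# Hypothetical choices: (x, y) means "x was preferred to y".
consistent = {("tea", "coffee"), ("coffee", "water"), ("tea", "water")}
cyclic = {("tea", "coffee"), ("coffee", "water"), ("water", "tea")}

consistent_ok = is_transitive(drinks, lambda x, y: (x, y) in consistent)
cyclic_ok = is_transitive(drinks, lambda x, y: (x, y) in cyclic)
scale = ordinal_scale(drinks, lambda x, y: (x, y) in consistent)
print(consistent_ok, cyclic_ok, scale)
```

The cyclic data fail the axiom, so no assignment of numbers could represent them by ‘greater than’.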


There are sometimes also axioms known as existential axioms which are less easy to test 
empirically. These postulate the existence of elements with certain specified properties. They 
are basically technical axioms less likely to be of importance empirically, and it should be 
clear from experience if they do not hold. 


Applications of measurement theory in psychology 
Classification of scale type 


Under classical measurement theory scales of measurement are classified by the range of 
mathematical transformations that leaves the ‘scale forms’ invariant. No other 
considerations are regarded as relevant to the determination of what information is 
contained in scale values. A number of inconsistencies however become apparent with this 
approach. For example, Stevens argues that the ranking (nominal, ordinal, interval, ratio) 
represents increasing scale strength and therefore desirability. However, counting, as 
Stevens agrees, is a ratio scale. Many psychological measures are in fact frequency counts, 
e.g. number of trials, bar presses, words recalled, etc. Yet in spite of this Stevens still insists 
that ‘most of the scales used widely by psychologists are ordinal scales’ (1951, p. 26). IQ, 
for example, is an ordinal scale according to Stevens. Yet, by his definition it could be 
converted to a ratio scale by reverting to counting questions. Clearly something else is 
being taken into account, yet it is not specified what it is. Ellis (1966, p. 63) concludes that
Stevens ‘...simply makes the classification as though the reason for it were self evident’,
and Nunnally (1967, p. 21) that ‘... no-one has made it clear what types of evidence would
justify the assumption of a particular scale type’.

The unstated considerations are clearly the theoretical assumptions regarding the nature 
of the relationship between the observed events and the property being measured. While 
frequency of bar pressing can be observed and measured on a ratio scale, if it is ‘amount of
learning’ that is being measured this can only be in an associative sense via the assumption
that ‘the more the task is learnt the faster the rate of bar pressing’.




Stevens’ position was that it was unnecessary to make theoretical assumptions relating 
observations to theoretical quantities explicit at any stage. He argued that the principle of 
scale invariance was sufficient to determine what information was contained in scale values. 
He states (1951, p. 29) ‘We may seek the final and definitive answer in the mathematical 
group structure of the scale form, in what ways can we transform its values and still have it 
serve all the functions previously fulfilled.’ Ellis (1966), however, demonstrates that an 
inadmissible transformation of a scale can still serve all the functions and retain all the 
information of the old scale. It will change the nature of the relationships with other scales, 
but since Stevens does not include any assumptions regarding the ‘meaning’ of the scale in 
his system this cannot provide a check on which of the two scale forms is ‘correct’. 
Ellis gives as an example the application of a log transformation to the Kelvin scale of
temperature. The transformation $T' = K \log T$ is clearly inadmissible, yet as Ellis states:

... what purposes could not be served by such a scale ($T'$)? What information do we
fail to give if we give our measurements in ($T'$)? ... What predictions could we not
make? What laws could we not express? ... Could we not calculate the resulting
temperature in a calorimetric mixing experiment? Certainly. The method of calculation
would not be the usual one. But what does that matter? Could we, using such a scale,
have discovered the (ideal) gas law? ... Certainly. The form of this law would not be
the usual one. It would be found empirically that [$pV = R\varepsilon^{T'}$, where $\varepsilon$ is the Kth root
of the base used in the logarithm] (instead of $pV = RT$). But again, what does this
matter? Finally, could we explain such a law on the kinetic theory? Yes, of course we
could. We should have to assume that temperature is log (K. E.), instead of K. E. But
what would be wrong with that? Our only reason for assuming that the temperature of
a substance is the average kinetic energy of its molecules is that it enables us to explain
the temperature laws that we have discovered using our ordinary temperature scales
(Ellis, 1966, pp. 60-61).

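
Ellis's point can be made concrete with a small numerical sketch. The values of K and of the logarithm base below are arbitrary assumptions, as are the function names. The mixing rule assumes equal masses of one substance with constant specific heat. The code states the gas law and the mixing calculation directly on the $T'$ scale and confirms that they give the same physical answers as the ordinary Kelvin calculations.

```python
import math

K = 2.0                   # arbitrary constant in T' = K log T (assumed)
BASE = 10.0               # base of the logarithm (assumed)
R_GAS = 8.314             # gas constant, J/(mol K)
EPS = BASE ** (1.0 / K)   # Kth root of the base, as in Ellis's form of the law


def to_log_scale(t_kelvin):
    """Transform a Kelvin temperature to the scale T' = K log T."""
    return K * math.log(t_kelvin, BASE)


def from_log_scale(t_prime):
    """Invert the transformation: T = EPS ** T'."""
    return EPS ** t_prime


def pv_on_log_scale(t_prime):
    """Ideal gas law for one mole written in terms of T': pV = R * EPS ** T'."""
    return R_GAS * EPS ** t_prime


def mix_on_log_scale(t_prime_1, t_prime_2):
    """Final T' after mixing equal masses (constant specific heat assumed)."""
    return K * math.log((EPS ** t_prime_1 + EPS ** t_prime_2) / 2.0, BASE)


t1, t2 = 300.0, 400.0
pv_usual = R_GAS * t1
pv_transformed = pv_on_log_scale(to_log_scale(t1))
mix_transformed = from_log_scale(mix_on_log_scale(to_log_scale(t1), to_log_scale(t2)))
print(pv_usual, pv_transformed, mix_transformed)
```

The form of each law changes, but no prediction is lost, which is the sense in which the transformed scale serves all the functions of the old one.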
Other examples have also been given; for instance, Krantz et al. (1971, pp. 11-12) point
out that one could employ a variety of alternative scales for length related to the
conventional one by transformations like $L' = L^2$ or $L' = e^L$. This would mean that the
mathematical operation corresponding to measuring the length of two adjoining lengths
would be changed to, in the first case,

$L'_T = \left(\sqrt{L'_1} + \sqrt{L'_2}\right)^2$

and in the second

$L'_T = L'_1 L'_2$.

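
Both combination rules follow directly from the transformations, as a short check shows. The sketch below is illustrative only: the function names and the sample lengths are assumptions, and the subscript T simply denotes the combined length.

```python
import math


def concat_on_squared_scale(lp1, lp2):
    """Combination rule when lengths are reported as L' = L ** 2."""
    return (math.sqrt(lp1) + math.sqrt(lp2)) ** 2


def concat_on_exponential_scale(lp1, lp2):
    """Combination rule when lengths are reported as L' = e ** L."""
    return lp1 * lp2


# Two adjoining lengths on the conventional (additive) scale.
l1, l2 = 3.0, 4.0
total = l1 + l2

# Transform, combine with the new rule, and compare with the transformed total.
squared_total = concat_on_squared_scale(l1 ** 2, l2 ** 2)
exponential_total = concat_on_exponential_scale(math.exp(l1), math.exp(l2))
print(squared_total, total ** 2)
print(exponential_total, math.exp(total))
```

On either alternative scale the same empirical fact about adjoining lengths is expressed, only by a different arithmetic operation.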
This problem is avoided under axiomatic measurement theory as scale classification is 
directly expressed by the types of arithmetic relations that are being used to represent the 
empirical system. There are therefore, as many scale types as there are sets of arithmetic 
relations. This system of scale classification can, however, also be expressed in terms of 
admissible transformations; ‘in what ways can the numbers assigned be changed without 
affecting their measurement properties?’. The distinction, however, is that there is now a 
formal statement of the meaning of the scale which enables the criterion of invariance to be 
defined. That is, it is a question of whether the transformed scale values can still serve as a 
representation of the empirical relational system; ‘do they still obey the same measurement 
model?’ For example, for the measurement model described previously the set of numbers 
need only obey the requirement that f(x) > f(y) iff xRy. Thus, any transformation of scale
values that preserves their order yields another admissible scale. The scale is thus defined as 
being unique up to an order preserving transformation, or in Stevens' terms, an ordinal 
scale. 
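
The criterion of admissibility for this measurement model can be stated as a simple check: a transformation is admissible if it leaves every pairwise order comparison among the scale values unchanged. The scale values, the transformations and the function name in the sketch below are hypothetical.

```python
import math


def preserves_order(values, transform):
    """True if a > b holds exactly when transform(a) > transform(b)."""
    return all((a > b) == (transform(a) > transform(b))
               for a in values
               for b in values)


# Hypothetical scale values assigned under the ordinal measurement model.
scale_values = [-2.0, 0.5, 1.0, 3.0]

# Strictly increasing transformations give another admissible scale ...
cube_ok = preserves_order(scale_values, lambda v: v ** 3)
exp_ok = preserves_order(scale_values, math.exp)
linear_ok = preserves_order(scale_values, lambda v: 2.0 * v + 10.0)

# ... whereas squaring reverses the order of -2.0 and 0.5 and is inadmissible.
square_ok = preserves_order(scale_values, lambda v: v ** 2)
print(cube_ok, exp_ok, linear_ok, square_ok)
```

The check refers only to the relation that the numbers are assumed to represent, which is what gives the invariance criterion a definite meaning here.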

The transformational invariance formulation is useful as it serves to identify the degree of
uniqueness of a scale, or conversely, which sets of scales are equivalent. The logical basis of
the inconsistencies in Stevens' approach is now apparent. The system is based on a 
formulation which in theory can determine scale types, but in fact can do no more than 
identify relations with other scales. A scale cannot be an ordinal or a ratio scale in any 
absolute sense, but can only bear an ordinal or ratio relationship with another scale. 


Extended Stevens-tradition measurement theory 


Prytulak (1975) has suggested an extension of the ST approach that recognizes the logical 
limitations of the transformational invariance principle. He suggests that the choice of scale 
type should be based on empirically determined relations with other scales. To support this 
notion he observes that the procedures used by ST theorists to identify scale type do in fact 
implicitly, if not explicitly, recognize the fact that it is based on transformations into other 
scales. They avoid the direct recognition of the second scale by using examples such as 
subjective length where some familiar scale is implicitly present (i.e. physical length) and so
the reader can readily appreciate the relation between the scales without any overt 
recognition that a transformation is involved. 

So in fact, as Prytulak observes, the actual choice of scale type results from a 
consideration of all such implicit combinations that are relevant to the use of the scale, and 
involves such intuitive criteria as the relative frequency, importance and range of situations 
in which transformations of the higher and more desirable types occur. 

Prytulak argues that this process should be made explicit with a clear statement of how 
these (and other) criteria should be combined into an unequivocal classification criterion. 

There is undoubtedly a role for this extended version of ST measurement theory in 
dealing with the psychological measurement procedures that draw their justification from 
established relations with other scales. Many of the measures commonly used in 
psychology depend on associative assumptions, such as that relating an operational 
measure with its corresponding theoretical concept. 

This requirement is of course not unique to the measurement of psychological attributes; 
it is precisely the same type of assumption that needs to be made in associative 
measurement of physical properties, such as temperature by the expansion of mercury. 
However, with psychological attributes the relationship is often less direct and may involve 
probabilistic inferences such as ‘checking this category is more likely with this attitude than 
that one’. A measure that is even more likely to be correlated with the attitude, etc., can 
thus be obtained by adding scores over a number of such items. This is, however, not 
associative measurement in the sense defined by Ellis as a one-to-one relationship between
observed and theoretical quantities. Dawes (1972) uses the term index measurement and 
defines it as occurring ‘whenever a property of the thing being indexed determines a 
corresponding index and not vice versa’. 

This definition is quite neutral with respect to the theoretical rationale for including 
items in the index (and thus being able to predict from the index the way any future item 
would be scored). The theoretical assumptions are contained instead in predicted relations 
between the index and other scales, and are therefore embodied in the concept of validity. 
While it is possible to use interrelationships among attributes to define them it seems
preferable to refer constructs to actual behaviour and define index measurement along the 
lines described above (i.e. as a summation of behavioural events, each of which is 
probabilistically related to the property being measured). In practice of course theory 
guides both stages of index measurement; the initial selection of items and refinements to 
these as a result of validation against other measures. 

The choice of scale categorization for such index measures however, since they cannot be
referred in a simple manner to empirical observations, can only be derived in the manner
Prytulak suggests, in terms of their relations with other scales.

Table 2. Examples of statistical measures appropriate to measurements made on the
various classes of scales

Scale      Measures of        Dispersion            Association or              Significance
           location                                 correlation                 tests

Nominal    Mode               Information, H        Information                 Chi square
                                                    transmitted, T
                                                    Contingency
                                                    correlation

Ordinal    Median             Percentiles           Rank order                  Sign test
                                                    correlation                 Run test

Interval   Arithmetic         Standard              Product-moment              t test
           mean               deviation             correlation                 F test
                              Average               Correlation
                              deviation             ratio

Ratio      Geometric          Per cent
           mean               variation
           Harmonic
           mean

From Stevens, 1959, p. 26.


Selection of appropriate statistics 


ST measurement principles also lead to a misleading impression about the relation of 
statistics and measurement. Stevens' position is that the measurement scale entirely 
determines what types of statistical analyses are appropriate (see Table 2) and this is 
reflected by many popularly used statistics handbooks in the recommended criteria for 
choosing a particular statistical test (e.g. Siegel, 1956). 

Now one cannot say that measurement scales have no relation to statistics, although this 
proposition was actually put forward by some statisticians (e.g. Burke, 1953) and 
exemplified humorously by Lord's (1953) article ‘On the statistical treatment of football
numbers’. This view was based on the notion that statistics operates on numbers and one
does not need to consider what the numbers represent. Thus, Anderson (1961, p. 309) 
writes: 

the statistical test can hardly be cognizant of the empirical meaning of the numbers
with which it deals. Consequently the validity of the statistical inference cannot depend
on the type (or nontype) of measurement scale used.

Stevens' response to this was:

However much we agree that the statistical test cannot be cognizant of the empirical
meaning of the numbers, the same privilege of ignorance can scarcely be extended to
experimenters (1968, p. 849).

It is true that statistical tests are based on assumptions about the distribution of numbers
and test hypotheses about these numbers, so, as Hays (1963, p. 74) writes:

If the statistical method involves the procedures of arithmetic used on numerical scores
(however obtained) then the numerical answer is formally correct.

However, while the empirical meaning of numbers may not affect the validity of 
manipulations performed on them it clearly does affect the conclusions drawn from the 
results of these manipulations. Thus Stevens' point is valid that the inferences drawn from 
statistical techniques may be misleading if they are based on properties of the number 
system which are not in fact being assumed. One should clearly not use statistics which 
could be distorted by admissible transformations of the scale values. However, this is still
only part of the story, as the scale of measurement does not in itself determine what
empirical relations in the data are being assumed. If, for example, one is using muscle 
tension, skin conductance, cigarettes smoked or any other such measure as an index of 
anxiety, then even though one might be able to measure very precise values on this (ratio 
scale) operational measure, the fact that inferences are being made about anxiety would 
justify the use of a statistical test using only the ordinal information in the data. 
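
The way in which an admissible transformation can distort a statistic is easily demonstrated. The two groups of scores below are invented for illustration. If only the order of the scores is meaningful, a logarithmic transformation is admissible, yet it reverses the comparison of the group means while leaving the comparison of the medians unchanged.

```python
import math
from statistics import mean, median

# Hypothetical index scores for two groups (illustrative values only).
group_a = [1.0, 1.0, 10.0]
group_b = [3.0, 3.0, 3.0]


def a_exceeds_b(statistic, transform=lambda v: v):
    """Is statistic(group A) greater than statistic(group B) after transforming?"""
    a = statistic([transform(v) for v in group_a])
    b = statistic([transform(v) for v in group_b])
    return a > b


# Comparison of means: depends on the particular order-preserving scale chosen.
mean_raw = a_exceeds_b(mean)
mean_log = a_exceeds_b(mean, math.log)

# Comparison of medians: the same conclusion on both scales.
median_raw = a_exceeds_b(median)
median_log = a_exceeds_b(median, math.log)
print(mean_raw, mean_log, median_raw, median_log)
```

A conclusion about which group has ‘more’ of the attribute that rests on the means would therefore depend on an arbitrary choice among equally acceptable scales.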


Meaning of scales 


In the strictest sense the meaning of a scale should consist simply of a precise description of 
the empirical relation used to describe it — for example, a scale derived from subjective 
orderings of pairs of tones presented sequentially or from preferential choices from pairs of 
objects. One may wish to make the further inference that these are scales of subjective 
loudness or subjective utility, etc. To do this one needs to show that empirical relations of 
different types but related to the same property, yield the same type of scale.* In fact, they 
often do not, with different scale forms resulting from different methods of data collection. 

In many cases the problem stems from the use of measurement procedures without an 
explicit measurement model. For example, Stevens (1951) recommends the use of methods 
involving direct numerical assignment by subjects. Subjects may be asked to assign 
numbers to objects, to the differences between objects, to the ratio of objects, or even the 
ratio of differences between objects. Clearly this can only be held to constitute 
measurement if subjects can indeed follow these sorts of instructions in a consistent and 
unbiased manner. 

The assumption that subjects can make behavioural decisions based on numerical values 
(or even arithmetic operations on numerical values) on subjective scales is much stronger 
than that made by any measurement model where, at most, orderings of differences are 
required. 

There is, in fact, much recent evidence to show that subjects' use of the number scale is
not a direct reflection of the underlying metric properties of their subjective scales (Curtis et
al., 1968; Curtis & Fox, 1969; Curtis, 1970; Rule et al., 1970). Krantz (1972) and
Wagenaar (1975) suggest ways that experiments involving direct numerical estimation can
be related to a measurement model by assuming that the assignment of numbers is in fact a
second-stage process following the relational judgement between physical stimuli.

Index measurement is another method of numerical assignment not based on an explicit 
measurement model. For measures such as IQ the uniqueness problem is not well defined 
and thus statements involving, for example, the comparison of averages must contain an 
element of ambiguity. Since such measures can only be defined in terms of imperfect 
correlations with other measures, some of which may also be only index measures, the 
question of validity of measurement is often circular. For example, IQ and scholastic 
aptitude to a large extent can only be defined in terms of each other. While such measures 
undoubtedly can serve a useful purpose their inherent vagueness and imprecision means 
one should be cautious in using them in inferential statements going beyond simply
prediction (e.g. comparisons of intelligence between races, sexes, etc.). One would hope that 
as the science of psychology develops such index measures could be dispensed with in 
favour of scales directly derived from assumptions about underlying psychological 
processes. 


* This is equivalent to assuming some underlying reality to the scale and suggests the use of rules such as the
Maxwell-Bridgman criterion for physical reality: no quantity is real unless it is measurable by at least two
logically independent procedures (Bridgman, 1927).

Ellis (1966) regards this as unnecessarily restrictive and argues that it is completely a matter of choice whether
we regard a particular linear order as being of sufficient interest to denote it a quantity.


Conclusion

This review has attempted to demonstrate that many of the commonly accepted ideas
regarding psychological measurement and statistical analysis rest on an incomplete
logical foundation. While the extension of the classical approach to accommodate the 
relative nature of the use of transformations to identify scales does remove this logical 
inconsistency, this still leaves open the question of scale identification. The use of any 
measurement procedure where a specific measurement model has not been made explicit 
must necessarily produce a scale whose precise identification is to some extent ambiguous. 
The resolution of this uncertainty can then only depend on the extent to which theoretical 
or empirical relations can be established with other scales. 

This is not to say that psychological measurement procedures in current use are invalid; 
simply that the inferences drawn from them must reflect this uncertainty of definition. 
When an index measure such as a semantic differential, Likert scale or IQ score is used as 
a dependent variable in a research programme, and a significantly different variation is
obtained, there can be no quarrel that a qualitative causal effect of the experimental 
conditions has been observed. However, if psychology is ever to become a quantitative 
science then some meaning beyond simple statistical variation must be attached to scale 
values. The axiomatic approach to measurement theory offers much more compelling 
theoretical advantages. As Cliff (1973) observed in a review of scaling procedures: 

The recent achievements in measurement theory provide hardly less than the basis for 
a revolution in the definition of psychological variables (1973, p. 477).


Sex differences in imagery vividness: An artifact of the test


Roderick Ashton and Kenneth D. White 








When scores from a modified Sheehan/Betts imagery questionnaire were reanalysed, previous 
findings — that females report more vivid imagery than males — were not confirmed. Because the 
modification affected only response bias, through changes in questionnaire format, it is argued that
previously reported sex differences may be artifactual in so far as imagery vividness scores are 
concerned. 


Using Sheehan’s (1967) revision of Betts’ (1909) Questionnaire upon Mental Imagery 
(QMI) we found (White, Ashton & Brown, 1977) that women report significantly more 
vivid imagery than do men. As the standard QMI may be influenced by acquiescence or 
response set factors (DiVesta et al., 1971; White et al., 1974) the locus of these differences 
is open to question. That is, they may result from the operation of sex-linked factors other 
than imagery ability. Resolution of this problem would have important consequences for 
the vexed problem of the validity of imagery questionnaires (Ernest, 1977; Marks, 1977; 
White, Sheehan & Ashton, 1977). 

The present study re-examines the locus issue using a questionnaire format designed to 
minimize the influence of response set factors. This version contains the same questions as 
the QMI, but the format has been standardized and individual items are presented in a 
random order. The factor structure that resulted from analysis of this test was simpler, 
clearly interpretable in terms of known physiological groupings of sense modalities, and 
appeared to be less contaminated by questionnaire or response bias than the standard 
version (see White et al., 1978). 


Method 
Subjects 


The subjects who responded to the standard version of the Betts’ QMI have been described
previously (White et al., 1977). To summarize, there were 1385 female students and 829 males. In the
case of the responses to our random version of the QMI, the subject pool consisted of 273 female and
153 male students. All groups of students were taking Introductory Psychology courses, were test
unsophisticated at the time of testing, and were from equivalent university populations.


Test description 


The standard version of the QMI (Sheehan, 1967) is reproduced by Richardson (1969) and Hilgard 
(1970) and is described in detail in our previously cited works. The new random version was 
constructed by rewording each of the 35 questions to achieve consistent format. These reworded
questions were then randomly assorted, with the restriction that adjoining items did
not elicit responses from the same sensory modality. White et al. (1978) should be consulted for fuller
details, but as an illustration, the first two questions on the standard version were:

Think of some relative or friend whom you frequently see, considering carefully the picture that
rises before your mind's eye. Classify the images suggested by each of the following questions as
indicated by the degrees of clearness and vividness specified on the Rating Scale.

Item                                                        Rating
1. The exact contour of face, head, shoulders and body       ( )
2. Characteristic poses of head, attitudes of body, etc.     ( )

For the random version the first two questions were:

Item                                                        Rating
Think of:
1. Feeling, the warmth of a tepid bath                       ( )
2. Seeing for a relative or friend, the different
   colours worn in some familiar costume                     ( )


The method used to rate vividness was a seven-point scale from 1 ‘Perfectly clear and vivid’ to 7 ‘No 
image at all’ and was the same for both versions, as were the preliminary printed instructions. 


Procedure 


The tests were contained in a booklet of other unrelated questionnaires and were administered to 
students during class time. 


Results 


Modality and total-test scores were calculated for the new random version of this test. 
These are presented in Table 1 together with the values obtained from the original version 
of the test. The differences between the mean scores on each of the two tests are given in 
Table 2. Also included in that table are the values of Student's t associated with each 
difference score. 
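
The t values can be approximated from the summary statistics alone. The report does not state which form of the t statistic was used, so the sketch below rests on an assumption: the standard error is computed without pooling the two variances. The function name is hypothetical; the figures are the men's visual-modality means, standard deviations and group sizes given in Table 1 and in the Subjects section.

```python
import math


def t_from_summary(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """t statistic for a difference between two independent means.

    Assumes an unpooled standard error; this choice is an assumption,
    not something stated in the report.
    """
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Men, visual modality: random version (n = 153) against original version (n = 829).
t_visual_men = t_from_summary(15.01, 5.49, 153, 12.60, 4.20, 829)
print(round(t_visual_men, 2))
```

Under this assumption the result is about 5.16, close to the tabled value of 5.15. The small discrepancy may reflect rounding of the published means and standard deviations.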

To summarize in words the results tabulated, on the older version of the QMI women
reported significantly more intense or vivid imagery overall and in all sensory modalities
except audition. On the random version, however, the only significant sex differences were
in the visual and organic modality vividness ratings. Comparing the random test ratings
with those given on the original test, all of them changed significantly except for the men's
scores in the auditory and kinaesthetic modalities. Overall, however, the women showed
larger differences on all comparisons except for the organic modality scores. The size of
these rating changes demonstrates that it was the women who changed their scores the most
between versions of the test.


Table 1. Imagery vividness scores on the random and original versions of the QMI

                               Male                  Female
Modality                Random   Original     Random   Original

Visual          Mean     15.01     12.60       13.82     11.40
                SD        5.49      4.20        5.66      3.90
Auditory        Mean     13.55     13.00       14.51     13.40
                SD        4.92      4.80        4.98      4.70
Cutaneous       Mean     15.62     14.40       15.07     12.60
                SD        4.52      5.10        4.91      4.60
Kinaesthetic    Mean     13.41     12.60       13.41     12.20
                SD        4.55      5.70        4.76      4.50
Gustatory       Mean     15.66     14.20       15.50     13.70
                SD        5.04      5.20        5.46      5.30
Olfactory       Mean     17.41     15.80       17.48     14.90
                SD        5.62      5.80        5.67      5.80
Organic         Mean     14.71     12.80       13.74     11.90
                SD        4.99      4.60        4.65      4.60
Total score     Mean    105.38     95.40      103.64     90.20
                SD       25.12     24.80       24.08     24.00


Table 2. Mean differences between rated imagery scores on the old and randomized
versions of Betts’ QMI

(a) Mean differences on the two tests

                           Male                                Female
Modality        Difference   t value (d.f. = 979)   Difference   t value (d.f. = 1656)

Visual              2.41          5.15**                2.42          6.76**
Auditory            0.55          1.26                  1.11          3.40**
Cutaneous           1.22          2.99**                2.47          7.67**
Kinaesthetic        0.81          1.95                  1.21          3.89**
Gustatory           1.46          3.28**                1.80          5.28**
Olfactory           1.61          3.24**                2.58          6.85**
Organic             1.91          4.39**                1.84          5.99**

Total scores        9.98          4.51**               13.44          8.43**

(b) Mean differences between scores of the men and women

                       Random version                      Original version
Modality        Difference   t value (d.f. = 423)   Difference   t value (d.f. = 2212)

Visual              1.19          2.12*                 1.20          6.68**
Auditory            0.96          1.93                 −0.40          1.91
Cutaneous           0.55          1.16                  1.80          8.25**
Kinaesthetic        0.00          0.00                  0.40          1.97*
Gustatory           0.06          0.12                  0.50          2.15*
Olfactory          −0.07          0.12                  0.90          3.52**
Organic             0.97          1.97*                 0.90          4.44**

Total scores        1.74          0.69                  5.20          4.85**

* P < 0.05; ** P < 0.01.


Discussion 


The present results go some considerable way towards answering the question posed above
concerning the nature of reported sex differences in imagery vividness. We have
demonstrated that such differences may be decreased to a significant extent by the simple
expedient of reducing the effect of questionnaire structure factors. Thus, on a test whose
format reduced such factors, men and women rated the vividness of their evoked images as
being about equal. Furthermore, it was the scores of female respondents which changed the
most when such factors were reduced. We may state, therefore, that previously reported sex
differences in rated imagery vividness probably reflected the fact that the women
overestimated the intensity of the images in response to some demand characteristic of the
test.

One point must, however, be stressed and that is that although the present results cast 
doubt upon reported sex differences in imagery vividness, they still support the notion that
some type of sex difference exists. After all, females do report more vivid images when
responding to items on the original QMI. The present data suggest, however, that they are 
giving high ratings in response to an instrument factor rather than to the evoked image per 
se. It is obvious, therefore, that this instrument factor interacts strongly with the sex of the 
respondent, influencing female testees more than males and, to a lesser extent, with the 
sensory modality being tapped. This last qualification is needed to account for our finding 
that there were still sex differences in the case of the visual and organic modalities. Such 
qualifications do not, however, pose much of a problem for future research, because total 
scores on the QMI are usually used to define subject characteristics rather than separate 
modality scores. 

In conclusion, the present results strongly suggest that total scores on the random 
version of the QMI should be used to define imagery ability in order to minimize possible 
contamination of results with an instrument factor; scores derived from the administration 
of the original QMI being too contaminated. 


References 
Betts, G. H. (1909). The Distribution and Functions of Mental Imagery. New York: Teachers College,
Columbia University.
DiVesta, F. J., Ingersoll, G. & Sunshine, P. (1971). A factor analysis of imagery tests. Journal of Verbal
Learning and Verbal Behavior, 10, 471-479.
Ernest, C. H. (1977). Imagery ability and cognition: A critical review. Journal of Mental Imagery, 2,
181-216.
Hilgard, J. R. (1970). Personality and Hypnosis. Chicago: University of Chicago Press.
Marks, D. (1977). Imagery and consciousness: A theoretical review from an individual differences
perspective. Journal of Mental Imagery, 2, 275-290.
Richardson, A. (1969). Mental Imagery. London: Routledge & Kegan Paul.
Sheehan, P. W. (1967). A shortened form of Betts’ Questionnaire upon Mental Imagery. Journal of
Clinical Psychology, 23, 386-389.
White, K. D., Ashton, R. & Brown, R. M. D. (1977). The measurement of imagery vividness: Normative
data and their relationship to sex, age, and modality differences. British Journal of Psychology, 68,
203-211.
White, K. D., Ashton, R. & Law, H. (1974). Factor analysis of the shortened form of Betts’
Questionnaire Upon Mental Imagery. Australian Journal of Psychology, 26, 183-190.
White, K. D., Ashton, R. & Law, H. (1978). The measurement of imagery vividness: Effects of format
and order on the Betts’ Questionnaire Upon Mental Imagery. Canadian Journal of Behavioral Science, 10,
68-78.
White, K. D., Sheehan, P. W. & Ashton, R. (1977). Imagery assessment: A survey of self-report
measures. Journal of Mental Imagery, 1, 145-170.


Received 23 May 1978, revised version received 20 September 1978 


Requests for reprints should be addressed to R. Ashton, Department of Psychology, University of Queensland, St 
Lucia, Queensland, Australia 4067. 
K. White is at the same address. 


British Journal of Psychology (1980), 71, 39-42. Printed in Great Britain


The golden section relation in the evaluation of 
environmental factors 


Benjamin Shalit 








Two hundred and thirty-two subjects, coming from different civilian and military environments, were
asked to evaluate factors which they perceived to be typical for their environment. Evaluation of
these factors was positive, negative or neutral. The golden section hypothesis was tested by the ratio
of positive to positive and negative evaluations and was not upheld. However, calculating the ratio of
positive to positive, neutral and negative evaluations produced a ratio of 0.62, thus confirming the
golden section hypothesis. Results were discussed in terms of judgemental style, modes of appraisal
and possible effect on coping.





The ratio of positive judgements (P) to positive and negative (N) judgements in aesthetic as 
well as interpersonal evaluation has been shown by Benjafield & Adams-Webber (1976) 
and Benjafield & Green (1978) to approximate to the golden section, i.e. P/(P+N) ≈ 0.62.
Benjafield & Green (1978) argue that there is a general tendency for people to organize
their judgements in such a way as to make the negative events maximally striking by 
contrast with the positive events. Typical and atypical events are expected to be 
perceptually organized in the golden section, much in the same way as are positive and 
negative events. In their paper, the authors obtain confirmation for their hypothesis by 
using interpersonal judgements derived from a fixed size, ordered universe, of 
acquaintances. The authors structure this universe rather artificially, and one might ask 
whether the golden ratio would hold true even if judgements were to be made in a much 
less clearly bound and ordered universe. 

We carried out an investigation relating perceptual organization, ambiguity and coping. 
As part of this investigation subjects were asked to list those factors which they considered 
to be typical for the particular environment studied, e.g. their work or their military 
service. They were also requested to evaluate these factors as positive or negative, and it is 
the ratio of these evaluations, as applied to the factors chosen by the subjects themselves, 
which we used to test the hypothesis of the golden section. 


Method 
Subjects 


Six different groups were investigated: 

(1) One hundred and thirteen Swedish National Service men, doing compulsory military service,
aged 18-20. These were in six groups, representing six different infantry platoons, numbering 16, 19,
22, 19, 21 and 16 soldiers. Results for each of the groups are presented separately.

(2) Thirty-five professional soldiers, senior NCOs and officers up to the rank of Lieutenant. Ages
approximately 25-39. These came from different companies (10, 10, 9 and 6) and results will be
presented separately for each group. They will be referred to as ‘company command group’.

(3) Seventeen professional soldiers: officers ranking from Captain to Lieutenant Colonel, ages
approximately 30-45. They represented several arms of the services, and will be referred to as ‘senior
officer group’.

(4) Fifteen chief air traffic controllers (civilian and military), all males, from various Swedish 
airfields. Ages approximately 35-55. 

(5) Seven male and 10 female social scientists, all working within a research institute. Ages 20-46. 

(6) Thirty-five female crane operators, working at a steel foundry. Ages 22-45. 

Scoring

All subjects were asked: ‘What are the factors which you feel characterize or are typical of your work 
(or your military service)?’ Although this was given as an open-ended question, for technical reasons 
a limit of 12 answers (factors) was imposed. In practice we found that only about 2 per cent of any 
sample gave 12 responses. 

Once the factors were listed, subjects were asked to indicate whether they found each factor to be
very pleasing or attractive by marking it with ++; pleasing or attractive by marking it with +;
displeasing or unattractive by marking it with −; or very displeasing or unattractive by marking it with
− −. Any factor which was neither pleasing nor displeasing was to be marked with a zero.

The golden section ratio was calculated as the ratio of + signs to the total of + and — signs, and 
the mean score for each group determined. 
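
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `positive_ratio`, the ASCII spellings of the marks and the example data are hypothetical, and the sketch assumes that each listed factor counts once, so that ++ and + both count as a single positive evaluation (the text does not say how intensity was weighted). The optional `include_neutral` flag corresponds to the second ratio used later in the Results, where zero responses join the denominator.

```python
from typing import Iterable

# Marks used in the scoring instructions; "--" stands in for the printed double minus.
POSITIVE = {"+", "++"}
NEGATIVE = {"-", "--"}
NEUTRAL = {"0"}


def positive_ratio(marks: Iterable[str], include_neutral: bool = False) -> float:
    """Return P / (P + N), or P / (P + N + Z) when include_neutral is True.

    Assumption: every factor counts once, whatever the intensity of its mark.
    """
    marks = list(marks)
    valid = POSITIVE | NEGATIVE | NEUTRAL
    unknown = [m for m in marks if m not in valid]
    if unknown:
        raise ValueError(f"unrecognised marks: {unknown}")
    p = sum(m in POSITIVE for m in marks)
    n = sum(m in NEGATIVE for m in marks)
    z = sum(m in NEUTRAL for m in marks)
    denominator = p + n + (z if include_neutral else 0)
    if denominator == 0:
        raise ValueError("no scorable evaluations")
    return p / denominator


# Hypothetical subject who listed eight factors (illustrative data only).
example_marks = ["++", "+", "+", "0", "-", "+", "0", "+"]
ratio_pn = positive_ratio(example_marks)
ratio_pnz = positive_ratio(example_marks, include_neutral=True)
print(f"P/(P+N) = {ratio_pn:.3f}; P/(P+N+0) = {ratio_pnz:.3f}")
```

A group score would then be the mean of these individual ratios over the subjects in the group.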


Results and discussion 


The ratios of (+) to (+ & −) signs for each group are given in the first column of Table 1.
The obtained ratio of approximately 0.8 does not uphold the golden section hypothesis. These ratios,
for each group, are remarkably constant, but notably higher than those predicted by the
golden section hypothesis. It would appear that the proportion of negative evaluations is
consistently smaller than predicted, or, to adopt Benjafield & Green's (1978) approach,
the typical, or positive, aspects are overrepresented.


Table 1. Ratio of evaluations of their environment by different groups

                                        Ratio
Group                        n      +/(+ & −)    +/(+ & 0 & −)
National servicemen
  1                         16      0.85         0.59
  2                         19      0.85         0.60
  3                         22      0.87         0.64
  4                         19      0.83         0.59
  5                         21      0.82         0.67
  6                         16      0.89         0.62
Company command
  1                         10      0.82         0.63
  2                          9      0.81         0.62
  3                         10      0.82         0.58
  4                          6      0.70         0.61
Senior officers             17      0.82         0.61
Air traffic controllers     15      0.88         0.62
Social scientists           17      0.57         0.61
Crane operators             35      0.77         0.63
Total                      232
Mean                                0.796        0.616
SD                                  0.089        0.023








However, the golden section hypothesis as stated by these authors relates to people who
‘differentiate one thing into two’ (not necessarily positive and negative). We have ignored
this fact in our initial analysis, and have not considered the fact that our subjects divided 
their evaluations into three categories (positive, negative and neutral). We therefore 
calculated the ratio again, this time putting negative and zero responses in the same 
category. That is, the dichotomous division was into positive aspects of the environment, 
and those aspects which were not positive. The results of the calculation of this ratio are 
presented in the second column of Table 1, headed +/(+ & 0 & −).

The grand mean ratio is approximately 0.62, which clearly upholds the golden section hypothesis. It
would appear that the positive aspects of the perceived environment offer a focal point for 
perceptual differentiation, and offer a basis on which the world can be dichotomized, 
classifying all other judgements — whether neutral or negative — into a separate category. 
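
As a rough check on this claim, the second column of Table 1 can be compared with the golden section itself, (√5 − 1)/2 ≈ 0.618. The sketch below is illustrative only: the variable names are ours, and it assumes that the printed Mean and SD are the unweighted mean and the sample standard deviation of the 14 group ratios, which is what the table entries appear to reproduce.

```python
import math
import statistics

# Second column of Table 1: + / (+ & 0 & -), one value per group.
ratios = [
    0.59, 0.60, 0.64, 0.59, 0.67, 0.62,  # national servicemen, groups 1-6
    0.63, 0.62, 0.58, 0.61,              # company command, groups 1-4
    0.61, 0.62, 0.61, 0.63,              # senior officers, controllers, social scientists, crane operators
]

golden_section = (math.sqrt(5) - 1) / 2      # about 0.618
group_mean = statistics.mean(ratios)         # unweighted mean over the 14 groups
group_sd = statistics.stdev(ratios)          # sample SD (n - 1 in the denominator)

print(f"golden section = {golden_section:.3f}")
print(f"mean = {group_mean:.3f}, SD = {group_sd:.3f}")
print(f"difference from golden section = {golden_section - group_mean:.4f}")
```

On these figures the grand mean lies well within one standard deviation of the golden section.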

The stability of the golden ratio across very different groups, relating to very different 
environments, raises the question whether we are dealing with some artifact or a basic
perceptual processing factor. 

A person’s judgement may be shaped by factors irrelevant to the context and content of 
his judgement, e.g. the graphic form of presentation of the scales (Shalit, 1974), or the 
number of categories used (McKelvie, 1978). However, as Kelly (1955) stated, every 
individual interprets himself and his surrounding psychological and physical environment 
in accordance with his own system of personal constructs. Thus, assuming a relative 
stability of personal constructs, one can expect a relative stability of a person’s judgemental 
styles. 


Table 2. Distribution of golden section ratio score for individual subjects

SD from mean      n      Per cent
  +2.5           25        11
  +2.0            4         2
  +1.5            4         2
  +1.0           90        39
  −1.0           86        37
  −1.5            3         1
  −2.0            1
  −2.5           19         8
Total           232       100


Benjafield & Adams-Webber (1975) have shown that individual P/(P + N) ratios differ
greatly, as does cognitive style. We were therefore interested in observing how great were
the individual variations from the 0.62 point in the 232 individuals studied. The pattern of
such variation might indicate whether all individuals followed a similar perceptual process
or whether marked, and possibly systematic, differences occurred. The distribution of the
individual responses, expressed in terms of standard deviations from the mean (i.e. the
golden section), is presented in Table 2, which reveals that 176 (76 per cent) of the subjects
show the perceptual style characterized by the golden section, falling within one standard
deviation of it, while 44 persons (19 per cent) show a marked deviation of 2½ standard
deviations. Twenty-five of the latter are heavily biased towards positive judgements, while
19 are biased towards negative judgements. Only 12 subjects (5 per cent) fall in the
intermediate range between 1 and 2½ standard deviations.
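
The tallies quoted in this paragraph follow directly from the counts in Table 2, as the short sketch below shows. The variable names are ours, and the reading of the band labels (±1.0 as "within one standard deviation", ±2.5 as the extreme bands) is an assumption based on how the text summarizes the table.

```python
# Table 2: band label (SD units from the golden section) -> number of subjects.
band_counts = {
    +2.5: 25, +2.0: 4, +1.5: 4, +1.0: 90,
    -1.0: 86, -1.5: 3, -2.0: 1, -2.5: 19,
}

total = sum(band_counts.values())
within_one_sd = sum(n for band, n in band_counts.items() if abs(band) == 1.0)
extreme = sum(n for band, n in band_counts.items() if abs(band) == 2.5)
intermediate = total - within_one_sd - extreme


def percent(count: int) -> int:
    """Percentage of all subjects, rounded to the nearest whole number."""
    return round(100 * count / total)


for label, count in [("within 1 SD", within_one_sd),
                     ("extreme (2.5 SD)", extreme),
                     ("intermediate", intermediate)]:
    print(f"{label}: {count} subjects ({percent(count)} per cent)")
```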

It would appear reasonable to describe the population as falling into three groups, 
representing three different perceptual styles: The ‘golden section’ group, which behaves 
according to the hypothesis we set out to investigate, the ‘optimist’ group and the 
‘pessimist’ group, each tending to see the world in more extreme, positive or negative 
terms. It is rather tempting to think that one may be able to characterize the perceptual 
style relating to environmental evaluation along a continuum, much like the continuum of 
perceptual style relating to field dependence-independence as described by Witkin et al.
(1962). If this were indeed the case one would sometimes have to establish the individual's 
‘judgemental style’ in these terms in order to correct for his predisposition to evaluate the 
environment in one way or another. Our data do not permit us to say whether the same 
individual would display the same type of judgemental style when relating to different
environments, but it is clear that no environment was linked with one specific style of 
response. 

As indicated by Lazarus et al. (1974), appraisal is an essential part of the coping process.
By ‘appraisal’ these authors mean the evaluation of an environment as harmful or 
beneficial — positive or negative. Thus a judgemental style which predisposes an individual 
towards a certain mode of appraisal might be of great significance in determining his 
general coping capacity and behavioural responses. 

The perception of ambiguity, or the appraisal of a stimulus as being ambiguous or 
non-ambiguous, and thus potentially threatening or innocuous, has been shown by Shalit 
(1977) to be related to coping. Again, such appraisal, much like the appraisal of the 
probability for the occurrence of an event (Eiser & Eiser, 1975), might turn out to be 
determined to a great extent by judgemental style, rather than by the more relevant 
contextual factors involved. 

Benjafield & Green (1978) referred to organizing typical and atypical events according to
the golden section, and we have shown that some individuals produce such an excess of
‘typical’ positive or negative evaluations that they cannot balance round that point. A
strong predisposition to balance and organize perceptions round a stable point, or a golden 
mean, might offer some coping and adaptation advantages under some conditions. On the 
other hand it is likely to delay and retard any ability to detect any deviations and 
alterations in the ‘normal’ world — and thus delay and affect sound appraisal and, 
consequently, sound coping. Therefore it would be of interest to study both the stability of 
the judgemental style of an individual across situations, and the rigidity of his style in the 
face of altering conditions in a given environment. 


A bilingual word-length effect: Implications for intelligence testing and the
relative ease of mental calculation in Welsh and English 


N. C. Ellis and R. A. Hennelly 





Five experiments are reported. These demonstrate that, in bilingual subjects, Welsh digits take longer
to articulate than their English equivalents, and this difference is paralleled by the finding that digit
span in Welsh is significantly smaller than that in English. These differences are attributable to
bilingual word-length differences, and it is this, rather than intellectual differences, which explains
why the norms for Welsh children on the digit span test of the Welsh Children’s Intelligence Scale are
reliably lower than those for American children of the same age tested on the similar digit span procedure
of the Wechsler Intelligence Scale for Children. These findings lead to the prediction that mental
calculation in the Welsh language will be more difficult than that in English.

An interaction between translation and storage in working memory is demonstrated. This finding
accords with the working memory formalization of Baddeley & Hitch (1974). It is shown that
translation towards the language of preference is faster than that in the reverse direction.








Individual differences in the span of immediate memory, as measured using strings of
random digits as stimuli, have commonly been utilized as subcomponents of intelligence
tests. In the Terman-Merrill test (1974), for example, a 10-year-old child is tested on his ability
to repeat six-digit strings in the correct order. Similarly, in the Wechsler Intelligence Scale
for Children (WISC, 1949) a child of the same age is tested for his ability to repeat digit strings
both in their original and reversed order. The sum of the forwards and reversed spans
measured on this test is compared with the norm score of 9 for a child of this age.

Recently, however, Baddeley et al. (1975) have demonstrated that the immediate memory 
span for short words is greater than that for long words. This effect cannot be solely 
attributed to the number of syllables or phonemes in the stimulus. Rather the effect is truly 
one of word length: even when the number of syllables and phonemes is held constant, the 
memory span for words which take a short time to articulate (e.g. wicket, phallic) is greater 
than that for words which take a long time to articulate (e.g. zygote, coerce). In general the 
span could be predicted on the basis of the number of words which the subject could read 
in approximately 2 s. 

This word-length effect is relevant to the use of memory span as a subcomponent of 
intelligence tests if the following three cases are considered. 

(i) When norms for subject populations of different language are being compared, if the 
articulation time for digits is longer in one language than the other, it is to be expected that 
the average span will be smaller in speakers of the former language. This difference would 
be attributable to this effect rather than to any intellectual differences between the two 
populations. 

(ii) When performance in each of two spoken languages is compared for bilingual 
subjects it again follows that, if digit articulation time differs across these languages, one 
must also expect a difference in span. This implies that this task cannot be used as a 
predictor of relative language competence for bilingual subjects. 

(iii) In the development or modification of intelligence tests for use in different languages
or dialects, it might seem reasonable to assume that, because the information content of a digit is
similar across languages, ‘a number is a number, whatever the language’ for the purposes of
testing, and thus that simple test translation would suffice. A test normalized and
written for English subjects might then be thought suitable in a translated form for use with
subjects of a different language. This is not the case: normalization must again be
performed whenever a test is modified for new languages or dialects.

It is argued that the following three experiments demonstrate these points.


Experiment 1 

Casual observation led to the suggestion that it takes longer to articulate digits in the 
Welsh language (dim, un, dau, tri, pedwar, pump, chwech, saith, wyth, naw) than their 
English equivalents (nought, one, two, three, four, five, six, seven, eight, nine). This was 
therefore tested experimentally. 


Subjects
Twelve bilingual subjects in the 20-30 year age range were tested. The minimum criteria for
bilingualism were:

(1) Subjects had to have been educated in both Welsh and English and to regularly speak both
these languages.

(2) Having assigned performance in his stronger language the score 10, in ranking his ability in
his secondary language on the arbitrary scale 0-10, the subject was not to assign to this a value of 5
or less. Thus any person rating himself as having a spoken ability in his weaker language less than
half that in his primary language was excluded from the study.

Of the 12 subjects participating in this experiment only four considered themselves to be more 
competent in the English language. 


Method 


A sheet of 20 lines, each containing the 10 digits 0 to 9 in random order, was prepared. The
subjects were required to read these numbers aloud as fast as possible. They read this list eight times,
alternating language on each trial. Six subjects performed the first trial in English, six in Welsh.
The reading time for the 200 digits was recorded on each trial, and a mean time calculated for each
language.


Results and discussion 


The resultant data were analysed as a two-factor ANOVA with replications (12 subjects × 2
languages × 4 replications). Individual differences were seen: the subjects factor was
significant at the 1 per cent level (F = 73.5, d.f. = 11, 72, P < 0.01). There was a significant
difference in reading time for the two languages: the mean reading time for the 200 digits in
Welsh was 77.1 s compared with that in English of 64.2 s (F = 383.0, d.f. = 1, 72,
P < 0.01). The two-way interaction (F = 21.4, d.f. = 11, 72, P < 0.01) demonstrates that
individual differences are present in the relative difficulty of the Welsh condition compared
to the English condition.

The initial hypothesis was therefore confirmed: even though only one-third of the subjects
rated themselves more competent in English than in Welsh, every subject read the digits 
faster in English. It took on average 385 ms to read a Welsh digit compared with 321 ms to 
read an English digit. That is, on average, a subject would read six digits in English in the 
time taken to read five in Welsh. 
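
The per-digit figures and the six-to-five comparison follow from the mean sheet times reported above. The sketch below simply redoes that arithmetic; the variable names are ours, and the only inputs are the 200-digit sheet and the two mean reading times given in the text.

```python
DIGITS_PER_SHEET = 200          # 20 lines x 10 digits

welsh_total_s = 77.1            # mean time to read the sheet in Welsh
english_total_s = 64.2          # mean time to read the sheet in English

welsh_ms_per_digit = 1000 * welsh_total_s / DIGITS_PER_SHEET
english_ms_per_digit = 1000 * english_total_s / DIGITS_PER_SHEET

# How many English digits fit into the time needed for five Welsh digits?
english_digits_per_five_welsh = 5 * welsh_ms_per_digit / english_ms_per_digit

print(f"Welsh:   {welsh_ms_per_digit:.1f} ms per digit")
print(f"English: {english_ms_per_digit:.1f} ms per digit")
print(f"English digits read in the time of five Welsh digits: "
      f"{english_digits_per_five_welsh:.2f}")
```

The Welsh figure comes out at 385.5 ms, which the text reports as 385 ms.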


Experiment 2 


The findings of Expt 1 lead to the prediction that these subjects would have a greater 
immediate memory span for English digits than for Welsh digits, even though the majority 
of them considered themselves more competent in the Welsh language. The same 12 
subjects were therefore tested in Expt 2 to assess this prediction. 

Two translation conditions, where the subject was to report the digits in a language 
other than that in which they were given, were included for two reasons: 

(1) To investigate if translation is equally easy in both directions: from English to Welsh,
and Welsh to English. 

(2) To investigate the interaction in working memory between the processing necessary 
for translation and the storage necessary in the span task. This is, in effect, the same 
paradigm as used by Baddeley & Hitch (1974), who investigated the effects of memory 
preloads on the processing involved in verbal reasoning and comprehension. They found 
that with preloads of up to three items there was no significant effect on either speed or 
accuracy on these tasks, whereas six-item preloads caused a reduction in performance 
levels. They suggest from these and other findings that working memory can be usefully 
regarded as consisting of a central processor unit (CPU) which undertakes executive 
functions and which can be used for storage should processing demands be small. In 
addition there is postulated an articulatory loop (AL), the phonemic part of working 
memory, which can store up to about three items. This schema is used to explain the 
preload results — with small preloads storage is primarily AL based, with no resultant 
interaction between storage and CPU processing. Should storage demands exceed the 
capacity of the AL, the remainder must be stored in the CPU resulting in less ‘work space’
available for processing functions and thus reduced processing performance. 

Following this line of reasoning, in the present experiment the subject is to remember as 
many items as possible. Thus storage demands will exceed the capacity of the AL, and a 
storage/translation trade is thus predicted. Here, however, the effect of processing 
(translation) on storage is being investigated, as opposed to the effects of storage on 
processing as studied by Baddeley & Hitch (1974) and Hitch & Baddeley (1976). 


Method 


A cassette recorder was used for stimulus presentation. A bilingual person prepared recordings by 
reading strings of digit stimuli at a rate of one per second, with the warning signal ‘ready’ preceding 
every string. 

There were four experimental conditions: two used Welsh digits and two English digits. For each
condition the stimuli consisted of three trials at each length of string from two to 10 digits, and these
were presented in ascending order of length. Digits were chosen at random with the restriction that
no digit could be repeated within any string.

The presentation order of the four conditions was randomized across subjects. The four conditions
were: same language condition 1: English stimuli, English response; same language condition 2:
Welsh stimuli, Welsh response; translation condition 3: English stimuli, Welsh response; and
translation condition 4: Welsh stimuli, English response.

The subject was to listen to the strings and, upon a cue to respond, either to repeat the digits he 
had heard in the correct order and the same language, or, in the translation conditions, to report 
those digits he had heard in the correct order but in a different language: if the stimuli were presented 
in Welsh he was to respond in English, and vice versa. Within any condition he continued in this 
fashion until he had made incorrect responses on three consecutive trials. His span was calculated by 
application of the formula: span = 1+ (number of trials correct/3). 
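
The stopping rule and span formula can be sketched as follows. This is an illustration rather than the authors' scoring procedure: the function name `digit_span` is hypothetical, and the sketch assumes that 'three consecutive trials' is counted across the whole ascending sequence, so a run of errors may straddle two string lengths.

```python
from typing import Iterable


def digit_span(trial_outcomes: Iterable[bool], trials_per_length: int = 3) -> float:
    """Score a span test from trial outcomes given in presentation order.

    Outcomes are True (string reported correctly) or False, starting with the
    three trials at length 2, then the three at length 3, and so on.
    Testing stops after three consecutive incorrect trials; later outcomes
    are ignored. The span is 1 + (number of trials correct / 3).
    """
    correct = 0
    consecutive_errors = 0
    for outcome in trial_outcomes:
        if outcome:
            correct += 1
            consecutive_errors = 0
        else:
            consecutive_errors += 1
            if consecutive_errors == 3:
                break
    return 1 + correct / trials_per_length


# Hypothetical subject: perfect up to length 6, one success at length 7,
# then three failures in a row (two at length 7, one at length 8).
outcomes = [True] * 15 + [True, False, False] + [False, True, True]
example_span = digit_span(outcomes)
print(f"span = {example_span:.2f}")
```

The constant 1 makes the scale line up with string length: a subject who is correct on every trial from two to 10 digits (27 trials) scores 1 + 27/3 = 10.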


Results and discussion 


The mean spans for the four conditions are shown in Table 1. A two-way ANOVA (12
subjects × 4 conditions) resulted in both main factors and the interaction being significant
at the 1 per cent level. Individual differences are again present (F = 18.3, d.f. = 11, 33,
P < 0.01). The conditions factor (F = 15.4, d.f. = 3, 33, P < 0.01) was broken down using
a Duncan Multiple Range Test, which demonstrates a significant superiority of performance
in English over that in Welsh (EE > WW, P < 0.01). This is the case even though the
majority of subjects rated themselves more proficient in the Welsh language, and it is
suggested in the light of the results of Expt 1 that this differential is the result of Welsh
digits taking longer to articulate than English digits.

Table 1. Effects of language and translation upon mean memory span for the 12 subjects
of Expt 2

                        Same language conditions    Translation conditions
                        EE          WW              EW          WE
Language of stimulus    English     Welsh           English     Welsh
Language of response    English     Welsh           Welsh       English
Mean digit span         6.55        5.77            5.11        5.64


The application of an a posteriori Scheffé test confirms the interaction between 
translation and storage: the contrast between the same language conditions (EE and WW) 
and the translation conditions (EW and WE) is significant at the 1 per cent level. Thus the 
need to translate from a stimulus of one language to a response of another reduces the item 
storage levels from those attained when storage alone is involved. This storage/processing 
trade is of interest in two lights. Firstly it is further evidence which accords with the 
working memory formalization of Baddeley & Hitch (1976). A more practical 
consideration, however, is that in situations where translation is necessitated, especially in 
the early stages of second language learning, it is predictable that memory span will be 
reduced, and this will result, for example, in lower levels of comprehension of sentences as 
a whole — a task which necessitates the remembering of not only the words constituting the 
sentences but also their relative position. 

The superiority of the English response (EE − EW = 1.44, P < 0.01) is greater than the
superiority of the English stimulus (EE − WE = 0.91, P < 0.01). This difference is in effect
the contrast between the EW and WE conditions, which are significantly different at the
0.05 level. This might suggest that whilst both stimulus and response effects contribute to
the superiority of English span over Welsh span (EE − WW = 0.78, P < 0.01), the main
contribution is from the English response effects. It should, however, be noted that this
finding can also be explained by the suggestion that translation from Welsh to English
requires more processing than that from English to Welsh. The contrast between the
storage decrement resulting from Welsh to English translation
(EE − WE = 0.91) and that resulting from English to Welsh translation (WW − EW = 0.66)
is, however, not significant.
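
The span differences quoted in this paragraph are simple contrasts between the four condition means of Table 1. The following sketch recomputes them; the helper `contrast` and the variable names are ours, and the code reproduces only the differences, not the significance tests.

```python
# Mean digit spans from Table 1 (first letter: stimulus language, second: response language).
spans = {"EE": 6.55, "WW": 5.77, "EW": 5.11, "WE": 5.64}


def contrast(a: str, b: str) -> float:
    """Difference in mean span between two conditions, to two decimal places."""
    return round(spans[a] - spans[b], 2)


english_response_effect = contrast("EE", "EW")   # same stimulus, English vs Welsh response
english_stimulus_effect = contrast("EE", "WE")   # same response, English vs Welsh stimulus
language_effect = contrast("EE", "WW")           # English vs Welsh, no translation
welsh_translation_decrement = contrast("WW", "EW")

# Same language conditions versus translation conditions (the Scheffe contrast).
same_language_mean = (spans["EE"] + spans["WW"]) / 2
translation_mean = (spans["EW"] + spans["WE"]) / 2
translation_cost = same_language_mean - translation_mean

print(english_response_effect, english_stimulus_effect,
      language_effect, welsh_translation_decrement)
print(f"mean cost of translation = {translation_cost:.3f} items")
```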

Baddeley et al. (1975) demonstrated that a subject's span could be predicted to be the 
number of words that could be read in approximately 2 s, and in parallel to this found a 
significant correlation between a subject's reading speed and his memory span. Both of 
these findings are confirmed here: 

(i) a Spearman rank-order correlation performed on the digit reading times determined
in Expt 1 and the spans resultant in Expt 2 yields a significant correlation of −0.47,
P < 0.05;

(ii) the mean time taken to read a Welsh digit in Expt 1 was 385 ms, the mean Welsh
span in Expt 2 was 5.77: these digits could be read in 2.22 s; comparable figures for the
English language are a digit span of 6.55 items which at a reading rate of 321 ms/digit
could be read in 2.10 s.
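
The calculation in point (ii) can be written out as below. The names are ours, and the second half of the sketch, which turns the argument round and asks how many digits fit into a 2 s window, is our own illustration of the Baddeley et al. (1975) rule of thumb rather than a figure reported in the text.

```python
# Per-digit reading times from Expt 1 (ms) and mean same-language spans from Expt 2.
reading_ms = {"Welsh": 385, "English": 321}
span_items = {"Welsh": 5.77, "English": 6.55}

# Time needed to read aloud a string as long as the mean span.
span_duration_s = {
    language: span_items[language] * reading_ms[language] / 1000
    for language in reading_ms
}

# Conversely, the number of digits that could be read in a 2 s window.
WINDOW_S = 2.0
digits_in_window = {
    language: WINDOW_S * 1000 / reading_ms[language]
    for language in reading_ms
}

for language in reading_ms:
    print(f"{language}: span lasts {span_duration_s[language]:.2f} s; "
          f"{digits_in_window[language]:.2f} digits fit in {WINDOW_S:.0f} s")
```

In both languages the span corresponds to a little over 2 s of articulation, which is the sense in which the span is roughly constant in time rather than in items.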


Experiment 3 


The differential found between the two conditions of Expt 1 is attributable to response
effects since the numerical stimuli were the same in the two conditions. If, however, digit
names rather than figures are used as stimuli, it is possible to investigate additionally both
stimulus effects and the effects of translation demand in a reading task. This latter
investigation, Expt 3b, will test the suggestion proposed in Expt 2 that, for subjects whose
preferred language is Welsh, translation to their preferred language (English to Welsh) is
easier than that in the reverse direction.

The memory span differential obtained in Expt 2 was again attributed to factors 
operating at the response level. However, it is unclear whether this is the result of language 
differences in word length (and thus the digit names of Welsh are more difficult to 
remember than those of English) or whether it is a result of differential familiarity (the 
Welsh speakers who were tested may have dealt more with numbers in the English 
language). These hypotheses are investigated in Expt 3c. 


Experiment 3a 


Method. Eight new subjects were found who fulfilled the criteria of bilingualism outlined in Expt 1. 
To ensure that this sample was consistent with the earlier one, the subjects were tested using the 
procedure of Expt 1. 


Results. The data, analysed as a two-factor ANOVA with replications, yield a similar
pattern to that of Expt 1: the subjects factor (F = 12.5, d.f. = 7, 48, P < 0.01), language
factor (F = 100.33, d.f. = 1, 48, P < 0.01), and two-way interaction (F = 6.05, d.f. = 7, 48,
P < 0.01) were all significant at the 1 per cent level.

The subjects were again slower at reading the 200 digits in Welsh (X̄ = 66.6 s) than in
English (X̄ = 57.6 s). These figures yield estimates of reading times of 333 ms per Welsh
digit (cf. 385 ms in Expt 1) and 288 ms per English digit (cf. 321 ms in Expt 1).

This second sample therefore shows similar bilingual characteristics to the first. 


Experiment 3b 


The procedure of Expts 1 and 3a involves presentation of digit figures (e.g. ‘9’) and either
English (‘nine’) or Welsh (‘naw’) responses. Thus any difference between the conditions is
attributable to the differing responses. If, however, the digit words are used as stimuli, it is
possible to investigate stimulus effects, and also to study the effects of translation demand
in the reading task.

Method. The eight subjects of Expt 3a were therefore next tested for their overt reading speed of (i)
200 English digit words with English response, (ii) 200 Welsh digit words with Welsh response, (iii)
200 English digit words with Welsh response, and (iv) 200 Welsh digit words with English response.
Conditions (i) and (ii) are same language conditions, (iii) and (iv) are translation conditions. The
stimuli were again presented as a sheet of 20 lines, each line containing the 10 digit names in a
random order. All the subjects were tested twice on each condition, the order of presentation of
conditions being counterbalanced across subjects. Reading times were measured with a stopwatch.


Results and discussion. Mean reading times for the 200 words of each condition are shown
in Table 2, as are the corresponding item-processing times.

The data, analysed as a two-factor ANOVA with replications (8 subjects × 4 conditions
× 2 blocks), demonstrated individual differences of reading speed (F = 9.32, d.f. = 7, 31,
P < 0.01), a highly significant condition factor (F = 639.7, d.f. = 3, 31, P < 0.01), and a
significant subject × condition interaction (F = 138.9, d.f. = 21, 31, P < 0.01).

The conditions factor was further analysed using a Duncan Multiple Range Test which
demonstrated no significant difference between the reading times in the EE and WW
conditions. This result is in striking contrast to that of Expt 3a where the digit figures were
read more slowly in Welsh than in English. The response effect operating in Expt 3a which
biases towards slower responses in Welsh is therefore being counteracted in Expt 3b by a
factor which differentially facilitates Welsh stimulus processing. The most likely explanation
of this factor is one of familiarity: the majority of the subjects considered themselves more
proficient in the Welsh language, and therefore it can be assumed that they preferentially
process Welsh, as opposed to English, words since they are more familiar with them
(familiarity/frequency effects in word recognition have long been demonstrated, see e.g.
Howes & Solomon, 1951; Tulving & Gold, 1963).


Table 2. Effects of language and translation upon mean digit word reading speed for the
8 subjects of Expt 3b

                                   Same language conditions    Translation conditions
                                   EE          WW              EW          WE
Language of stimulus               English     Welsh           English     Welsh
Language of response               English     Welsh           Welsh       English
Mean 200 item reading time (s)     61.5        58.5            107.3       131.3
Mean item processing time (ms)     308         293             537         657
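
The item-processing times in Table 2 are the 200-item reading times divided by 200, and the same figures show how much translation slows reading. The sketch below checks both points; the variable names are ours, and the comparison of averaged same language and translation times is our own summary rather than a statistic reported by the authors.

```python
ITEMS_PER_SHEET = 200

# Table 2: mean time (s) to read the 200 digit words in each condition.
reading_time_s = {"EE": 61.5, "WW": 58.5, "EW": 107.3, "WE": 131.3}
# Table 2: reported mean item-processing times (ms).
reported_item_ms = {"EE": 308, "WW": 293, "EW": 537, "WE": 657}

item_ms = {
    condition: 1000 * seconds / ITEMS_PER_SHEET
    for condition, seconds in reading_time_s.items()
}

same_language_s = (reading_time_s["EE"] + reading_time_s["WW"]) / 2
translation_s = (reading_time_s["EW"] + reading_time_s["WE"]) / 2
slowdown = translation_s / same_language_s

for condition, ms in item_ms.items():
    print(f"{condition}: {ms:.1f} ms per item (reported {reported_item_ms[condition]})")
print(f"translation / same language reading time = {slowdown:.2f}")
```

Each computed value falls within half a millisecond of the tabulated one, so the table appears to round to the nearest millisecond.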

To investigate the effects of translation, the results of Expt 3b were analysed as a 
three-factor ANOVA with replications (8 subjects x 2 condition types (same language vs. 
translation) x 2 response languages x 2 blocks). A highly significant effect of translation 
demand emerges (F = 1771.6, d.f. = 1, 31, P < 0.01): the need to translate results in
reading times which are almost double those of the same language conditions. As there is
no significant difference between reading times on the EE and WW conditions, the
significant condition type x response language interaction (F = 55.5, d.f. = 1, 31, P < 0.01)
demonstrates that for those subjects who are Welsh dominant, translation from Welsh to
English requires more processing than translation into the preferred language, viz. from 
English to Welsh. 

The EE and WW digit spans obtained in Expt 2 could be predicted as the number of 
digits that could be read in approximately 2 s, using reading rates determined for figural 
stimuli in Expt 1. If the spans of the 12 subjects in Expt 2 are used as indicators of the 
levels to be expected of the eight subjects of this experiment, it can be seen that a similar 
prediction could be made from these subjects’ same language reading rates: for the EE 
span 6.55 digits could be read at a rate of 308 ms/item in 2.0 s; for the WW span 5.77
digits at a rate of 293 ms/item could be read in 1.7 s. This is not the case, however, with the
translation conditions; the number of digits read and translated in 2 s as estimated in Expt
3b yields a severe underestimate of the Translation condition spans of Expt 2: for the EW
span 5.11 digits at a rate of 537 ms/item could only be read in 2.7 s; for the WE span 5.64
digits at a rate of 657 ms/item could only be read in 3.7 s. This is not surprising. The
reading times for the translation conditions can be assumed to be bipartite, one component 
representing the time taken to translate from the language of stimulus to that of response, 
the other the time to articulate this response. Baddeley et al. (1975) suggest that the 
memory span is constant when measured in units of time because the rehearsal system is of 
temporally limited capacity. Thus the more words which can be articulated within this 
temporal limitation, the more words that can be rehearsed, and the greater the span. It is 
the articulation time component of the translation condition reading times which will be 
operative in determining the number of items which can be rehearsed. The translation 
component will limit rehearsal time to a lesser extent: either (i) by delaying the initial 
rehearsal loop loading stage (assuming immediate translation and rehearsal in the language 
of response) or (ii) by delaying response production after rehearsal (assuming rehearsal in 
the language of the stimulus and translation immediately before response production). The 
amount that can be read in 2 s at a speed derived from the translation condition reading
times therefore yields an underestimate of span; the articulation time component of this
measure will produce a closer estimate, although this will itself be an overestimate.
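
The arithmetic behind these estimates can be checked with a short sketch. The Python below is illustrative only: it takes the mean 200-item reading times from Table 2 and the Expt 2 spans quoted above, and the variable and function names are assumptions introduced for the example rather than part of the original analysis.

```python
# Mean 200-item reading times (s) from Table 2 and digit spans from Expt 2.
READING_TIME_S = {"EE": 61.5, "WW": 58.5, "EW": 107.3, "WE": 131.3}
SPAN_DIGITS = {"EE": 6.55, "WW": 5.77, "EW": 5.11, "WE": 5.64}
ITEMS_READ = 200


def item_time_ms(condition: str) -> float:
    """Mean processing time per item in milliseconds."""
    return READING_TIME_S[condition] / ITEMS_READ * 1000.0


def time_to_read_span_s(condition: str) -> float:
    """Seconds needed to read a span-length string at the measured rate."""
    return SPAN_DIGITS[condition] * item_time_ms(condition) / 1000.0


for cond in ("EE", "WW", "EW", "WE"):
    print(f"{cond}: {item_time_ms(cond):.1f} ms/item, "
          f"{time_to_read_span_s(cond):.1f} s to read the span")
```

On these figures the same language spans take roughly 2 s to read, whereas the translation spans take appreciably longer, which is the discrepancy discussed above.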


Experiment 3c 


Digit span measured in the Welsh Language is smaller than that measured in English. It is 
not possible to conclude, however, that this is necessarily an effect of word length: both the 
span and reading rate differences might be attributable either to word-length differentials 
or to differences in degree of familiarity. This latter possibility must be considered as it 
seems that Welsh speakers do on occasion preferentially use English number names. For 
example, the present year is sometimes referred to as ‘nineteen seventy-eight’ in preference
to ‘mil naw saith wyth’ or the more clumsy ‘un mil naw cant saith deg wyth’. It is thus
possible that numbers are a special case of language usage, and therefore the language 
competence self-ratings obtained for our bilingual subjects may not represent their language 
of preference when dealing with numbers. 

Effects of word length and familiarity can be distinguished if articulatory suppression is 
used as an interference task. The word-length effect, which Baddeley et al. (1975) attribute 
to the functioning of the articulatory loop, is much reduced with visual stimulus 
presentation if the subject is undergoing articulatory suppression (repeatedly whispering the 
sequence 1-2-3 ... 8). Therefore if the difference between English and Welsh digit spans is a
result of the differential articulation time of the digit names, i.e. if it is a word-length effect, 
this difference should be either absent under articulatory suppression, or, if present, present 
in a much reduced form. 


Method. The eight subjects of Expt 3a were again tested here. The EE and WW conditions of Expt 2
were modified to allow for visual presentation and the interference task. Thus the digits constituting 
the strings were presented sequentially on a memory drum at a rate of one item per second. To 
ensure that the stimuli were processed in the required language, digit words were presented, e.g. 
‘pedwar’ or ‘four’, as opposed to the digit figures. The subjects were required, as in the same 
language conditions of Expt 2, to report the component digits of the strings in the correct order at 
the end of string presentation. The major difference between this procedure and that of Expt 2 was 
that throughout the period of digit string presentation the subject was to whisper the sequence 
‘a-b-c-d’ in a continuous cycle at the fastest rate compatible with clarity of pronunciation. The 
subjects were tested on both conditions with order of presentation counterbalanced. 


Results and discussion. The mean digit spans for the WW and EE conditions were 3.75 and
4.00 respectively; they are not significantly different (t = 1.40, d.f. = 7). These figures are to
be compared with those of WW span 5.77 and EE span 6.55 for the 12 subjects of Expt 2
where no suppression was used and stimulus presentation was auditory. 

It must therefore be concluded that the bilingual digit span differential is a word-length 
effect. Even for subjects who consider themselves more proficient in Welsh, the structure of 
the Welsh digit names necessitates that it is easier to remember lists of numbers in English. 
This effect, albeit relatively small (the English span being 114 per cent that of the Welsh 
span) must be assumed to be operative in everyday situations such as the short-term 
remembering of telephone numbers. 

This finding also leads to a prediction of greater importance. Hunter (1957) stressed that 
short-term memory plays an important part in calculation. Using a multiplication problem 
as an example, he suggests that short-term memory is involved in at least two ways: (a) as 
problems of this nature are solved through a succession of stages, as he proceeds from one 
stage to the next, the subject ‘needs to keep remembering which particular stage he is at in 
the calculation as a whole. He needs to remember what the original problem was, and must 
not lose his way, as he progresses from one stage of working to the next.’ (b) Often he has
also to remember the outcome of previous stages. 

Thus any factors which limit short-term storage will make mental arithmetic more 
difficult. These factors can be organismic, for example the limitations in the short-term 
memory of dyslexic children (dyslexia is commonly associated with problems in arithmetic), 
or linguistic. It has been shown that Welsh digits are, as a result of their word length, more 
difficult to remember in working memory than their English equivalents. It must therefore 
be proposed that, for subjects with equal practice in both languages, mental arithmetic 
carried out in Welsh will be more difficult than in English. 


General discussion 


It has been demonstrated that cross-lingual differences in word length may result in 
different magnitudes of digit span as measured in those languages. For this reason digit 
span norms cannot be compared across languages as an indicator of cultural intellectual 
differences. 

Wiliam & Roberts (1972) developed a Welsh Children's Intelligence Scale (WCIS) by 
modifying and translating the Wechsler Intelligence Scale for Children (WISC). The WISC 
was experimentally adjusted to Welsh idioms and to Welsh customs, and norms were 
statistically deduced from extensive trials with Welsh-speaking children taught in Welsh 
schools. The digit span subtest of the WCIS was, in effect, a direct translation of that of the
WISC; the same digit strings are used. If the norms on this test are compared to those of
the original WISC (see Table 3 where the digit span figures represent the sum of digit span
forwards and digit span reversed) it can be seen that the norms for the Welsh sample are
reliably less than those of the American sample. It is proposed that these findings cannot
be taken to imply intellectual differences between the two populations; rather, they are the
result of the differing languages, English digits being easier to remember than Welsh digits.

The effect of a language's number-name word length upon number memorability has 
been demonstrated here for Welsh and English, and it has been proposed that 
number-name word length will also affect the ease of mental calculation. There is no reason 
to doubt that this effect also operates in other languages. It is therefore suggested that 
languages will be more or less conducive to number memorability and 
manipulation/calculation, and that these language differences will be dependent upon a 
seemingly unlikely factor, viz. the word length of the languages’ number names. A useful 
area of further experimentation is therefore a survey of the word length of languages' 
number names. 

The following conclusions are drawn: 

(1) There is a translation/storage interaction in working memory which results in a 
reduced memory span when translation is necessitated. 

(2) For bilingual subjects who consider themselves stronger in Welsh, translation from 
English to Welsh is easier than that in the reverse direction. 


Table 3. Digit span scores (sum of digit spans forwards and reversed) for the American
population tested in English on the WISC procedure, and the Welsh population tested in
Welsh on the WCIS translation of WISC digit span procedure

Subject age
at test (years)          6.10  7.10  8.10  9.10  10.10  11.10  12.10  13.10  14.10  15.10
WISC digit span score    7     8     8     9     9      10     10     10     11     11
WCIS digit span score    7     7     8     8     8      9      9      9      9      10

(3) The development or modification of intelligence tests for use with different languages
or dialects must be accompanied by renormalization. As Burt (1939) stated in reference to 
the use of the WISC in England: testers in England ‘should be supplied with a 
standardised procedure and with standardised norms — a procedure which has been 
experimentally adjusted to English idioms and to English customs, norms which have been 
statistically deduced from extensive trials with English children, trained in English homes, 
and taught in English schools.’ It follows from this that norms for different adaptations of 
an intelligence test should not be directly compared with an aim to deducing intellectual 
differences between the populations from which these norms were derived. 

(4) English digit names can be articulated faster than their Welsh equivalents. 

(5) For bilingual subjects, even those who consider themselves to be more competent in 
Welsh than in English, digit span measured in the Welsh language is reliably less than 
measured in English. This finding is an effect of word length, and leads to the suggestion 
that any operation which involves remembering numbers (anything from mental arithmetic 
to the remembering of telephone numbers) will be more difficult to perform in the Welsh
language than in the English language.


Intelligence and semantic judgement time


P. H. K. Seymour and W. L. N. Moir 





An information-processing analysis of intelligence proposed by Eysenck and Furneaux was examined 
in an experimental study of semantic categorization and free recall by 11 year old children of varying 
intelligence. Although the general level of the reaction time in the categorization task was inversely
related to intelligence, predicted relationships with error frequency and the slope of the memory 
search function were not obtained. The subjects’ continuance in a free recall task was also unrelated 
to their intelligence. Some implications of these results are discussed. 





Some years ago Eysenck (1953, 1967) outlined an information-processing analysis of 
intelligence which appeared to hold out prospects of a beneficial integration of the 
psychometric and experimental traditions in psychology. His account, which derived from 
the work of Furneaux (1960), was based on the assumption that success in problem-solving 
is dependent on the speed and accuracy of a central process which generates and evaluates 
trial solutions, and on the delay tolerated before the search is abandoned as a failure. 

Eysenck (1967) presented two graphs which purported to describe a relation between 
intelligence and reaction time in problem-solving. The first of these illustrated Furneaux's 
finding that reaction time for correct responses increases as an exponential function of a 
normatively defined index of problem difficulty. Individual differences in intelligence affect 
the intercept of this function, but not its slope. The second showed the well-known 
relationship between choice reaction time and the number of stimulus-response alternatives 
in a choice task. The reaction time increases as a linear function of the log2 of the number
of alternatives (Hick, 1952), and the slope of this function may be treated as an index of an 
individual's rate of information processing. Eysenck cited evidence to show that the slope 
index was inversely related to intelligence, but that the intercept (which he took to reflect a 
simple reaction in which no choice was involved) was unrelated to intelligence. 
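
The second relationship can be illustrated with a minimal sketch. The Python below assumes the form RT = a + b log2(n) described above; the intercept and slope values are hypothetical and chosen only to show two individuals who share an intercept but differ in slope, which is the pattern Eysenck attributed to differences in intelligence.

```python
import math


def hick_rt_ms(n_alternatives: int, intercept_ms: float,
               slope_ms_per_bit: float) -> float:
    """Choice reaction time predicted by RT = a + b * log2(n)."""
    if n_alternatives < 1:
        raise ValueError("need at least one alternative")
    return intercept_ms + slope_ms_per_bit * math.log2(n_alternatives)


# Hypothetical parameters: a common intercept of 200 ms and two slopes.
for label, slope in (("shallower slope", 100.0), ("steeper slope", 150.0)):
    rts = [round(hick_rt_ms(n, 200.0, slope)) for n in (1, 2, 4, 8)]
    print(label, rts)
```

With one alternative the two hypothetical individuals respond equally fast; the difference appears, and grows, only as the number of alternatives increases.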

When taken together these statements imply that intelligence affects the time required to 
perform a normatively easy task (such as the making of a choice reaction), and that this 
effect is primarily due to variations in the capacity of a central process to distinguish 
among alternatives. In the Eysenck-Furneaux model the central process is viewed as a
TOTE unit (after the manner of Miller et al., 1960) which performs a succession of retrieval 
operations (the generation of trial solutions) and comparison operations (the matching of 
the solutions against a criterion defining acceptable outcomes). Hence, the model asserts 
that the rate and accuracy of functioning of a retrieve-and-compare process represents an 
important cognitive contribution to individual variation in tested intelligence.

A semantic version of the memory search task introduced by Sternberg (1966, 1967)
provides an experimental paradigm which may be used to test the validity of this proposal. 
In the memory search task the subject holds a small set of items in memory and classifies 
probe items positively if they are members of the set, and negatively if they are not. 
Numerous experimental studies have shown that the positive and negative reaction times 
increase as a function of memory set size, although there has been dispute as to whether the 
function is linear or logarithmic (Briggs, 1974). It has been usual to represent the relation
between the reaction time (RT) and the memory set size by statements of the form:

RT = A + B(M) ms, or RT = A + B(H_c) ms, where A is a zero intercept parameter, B is a
slope parameter, M is the size of the memory set, and H_c is a logarithmic transformation of
M (Sternberg, 1967; Briggs & Johnsen, 1972). Sternberg (1969, 1975) has presented
substantial evidence to support the view that memory set size is a factor exerting a selective
influence on the duration of a comparison stage whose rate of functioning is indexed by the
slope parameter, B. Briggs has outlined additional arguments to suggest that the 
comparison operation may be divisible into component processes of retrieval of memory set 
items and of matching these against probe items (Briggs & Blaha, 1969; Briggs & Swanson, 
1970; Briggs & Johnsen, 1972). In the analyses of both Sternberg and Briggs the intercept 
parameter, A, is taken to be an index of processes which precede and follow the 
comparison stage, that is the encoding of the probe, and the selection and initiation of a 
Yes or No response. 

There have been some previous investigations of the relationship between intelligence 
and the A and B parameters of the function relating RT to M. In studies in which IQ is 
held constant while chronological age and mental age covary it is typically found that 
mental maturity influences the intercept parameter, A, but not the slope parameter, B 
(Hoving et al., 1970; Maisto & Baumeister, 1975). On the other hand, an effect on B was 
found when 16 year old mental retardates were contrasted with normal children who 
matched them in chronological age or in mental age (approx. 84 years) (Harris & Fleer, 
1974). These results suggest that the effect on B predicted by the Eysenck-Furneaux model 
occurs only in cases of very low IQ, and that it is not characteristic of the intellectual 
variation sampled when children of average IQ but differing maturity are tested. 

The experiment to be reported extends the previous research in two main respects.
Firstly, we noted that the studies cited employed symbol processing versions of the memory 
search task. It seemed arguable that the retrieve-and-compare function postulated by the 
Eysenck-Furneaux model must be capable of non-literal semantic comparisons between 
representations of potential solutions and a criterion defining a class of acceptable 
solutions, and that it was therefore appropriate to use a semantic version of the task in 
testing its assumptions. Secondly, we used a design in which chronological age was held 
constant while IQ (and hence mental age) was varied over the range encountered within the 
normal school population (approximately 70-130 IQ points in our sample). 

A group of 10 year old children of varying IQ took part in a memory search experiment 
in which the memory sets consisted of one, two or three category names, and the probe 
items were names of class instances. The instruction was to respond positively if the probe 
named a member of one of the categories in memory, and negatively if it named a member 
of any other category. A previous investigation of this task by Juola & Atkinson (1971)
suggested that the introduction of a semantic criterion of set membership increases the 
complexity of the processing occurring during the comparison stage, producing what has 
been called a ‘translation effect’ (Cruse & Clifton, 1973). A contrast between a lexical 
condition, in which subjects decided whether or not probe words were included in a 
memorized list of words, and a semantic condition involving decisions about category 
membership, showed that the main effect of the conditions was on the slope of the function 
relating RT to M. For negative trials, the results for the lexical condition were described by 
the statement: RT = 617 + 26(M) ms, as against: RT = 653 + 111(M) ms for the category
condition. 
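
The size of this slope difference is easier to appreciate when the two functions are evaluated. The sketch below is illustrative: it uses the negative-trial parameters quoted above and evaluates them at memory set sizes of one to three (the sizes used in the present experiment, not necessarily those tested by Juola & Atkinson); the function name is an assumption made for the example.

```python
def predicted_rt_ms(intercept_ms: float, slope_ms: float, set_size: int) -> float:
    """Linear memory-search function RT = A + B(M)."""
    return intercept_ms + slope_ms * set_size


# Negative-trial parameters (A, B) quoted in the text for the two conditions.
CONDITIONS = {"lexical": (617.0, 26.0), "category": (653.0, 111.0)}

for name, (a, b) in CONDITIONS.items():
    rts = [predicted_rt_ms(a, b, m) for m in (1, 2, 3)]
    print(name, rts)
```

The two intercepts differ by only 36 ms, whereas each additional memory set item adds 85 ms more in the category condition than in the lexical condition.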

There is good reason to argue, therefore, that the semantic version of the memory search 
task implicates a complex central activity, involving the retrieval and comparison of 
semantic information, whose rate of functioning is indexed by the slope parameter, B, of 
the function relating RT to memory set size. This central activity seems qualitatively similar 
to the scanning and checking process postulated as a component of intelligence by Eysenck 
and Furneaux. Experiment 1 was conducted with the aim of determining whether the 
inverse relation between intelligence and B predicted by the model would be obtained. 


Experiment 1
Method 


Subjects. The subjects were 36 children (19 boys and 17 girls) from a Primary 6 class of a Dundee 
school. They were selected from a larger group of 70 children who had sat the Moray House Verbal 
Reasoning Test 93 about 6 months before the experiment was conducted. The results of this test
correlated highly with those of another Moray House test administered 1 year earlier (rho = 0.94,
P < 0.001), suggesting that the intelligence measure was highly reliable. The children chosen for
testing were assigned to six groups of six subjects, such that the average intelligence quotient (VRQ) 
was 124 in Group 1, 116 in Group 2, 108 in Group 3, 98 in Group 4, 88 in Group 5 and 76 in Group 
6. The range of VRQ scores was 9 points in Group 6 (71-80), 7 points in Group 1 (121-128), 6 
points in Group 2 (112-118), and 3 or fewer points in the remaining groups. Group 6 contained four 
boys and two girls. Otherwise, each group contained three boys and three girls.


Materials. The experimental materials consisted of six category names (sports, weapons, four-footed 
animals, furniture, parts of the body, male first names), which were used to form the memory sets. 
These categories all appear in the normative tables of Battig & Montague (1969), and have been 
shown to be relatively distant from one another semantically (Herrman et al., 1975). A selection of 12 
exemplars of each category was made, using the Battig & Montague tables. Since the dominance of 
items in their categories is an important factor in semantic categorization tasks (Wilkins, 1971;
McFarland et al., 1974), every effort was made to hold the average of the ranks of the production
frequencies of the items constant across the categories. Word length was also equated in so far as this
was possible, with the words averaging about 4 letters in each category except animals (3.7 letters)
and sports (6.7 letters). These items were used to form a set of positive probes in the classification
task. The set of negative probes was formed by selecting six exemplars of each member of a further 
group of six categories taken from the Battig & Montague tables (musical instruments, parts of a 
building, fruit, clothing, metals and relatives). The factors of word length and dominance were again 
equated across categories, since there is evidence that the dominance of an item in its own category 
may affect negative reactions in categorization tasks (McFarland et al., 1974; Millward et al., 1975). 


Apparatus. Memory sets, consisting of one, two or three category names, were presented to the 
subjects on cards for study prior to the trial sequence in which a particular memory set size was being 
tested. The probes were presented by means of a VR 14 display oscilloscope slaved to a PDP 12 
laboratory computer. The computer was programmed to display the probe words in a randomized 
sequence which was specified in advance of the test session. The words appeared in upper case letters 
at the centre of the display and remained in view until the subject made a vocal ‘Yes’ or ‘No’ 
response. This event was signalled to the computer via a voice-activated relay. The computer timed 
the reactions to the nearest millisecond, and printed a summary of the data at the end of each test 
session. 


Design. The experiment was based on a three-factor design, with the factor of intelligence (six levels) 
placed between subjects, and the factors of memory set size (one, two or three category names) and 
response (‘Yes’ versus ‘No’) placed within subjects. Reaction time and error rate were the principal
dependent measures. 


Procedure. Each subject was tested on four blocks of trials. The first of these consisted of 12 practice 
trials and involved presentation of two category names (vehicles and birds) and positive and negative 
examples which were not used in the main experiment. There then followed three blocks of 28 trials.
A new memory set was specified before each block, consisting of one, two or three category names. 
When the subject indicated that he or she was ready the randomized sequence of probe words was 
presented. The first four trials of the block were treated as practice, and did not feature in the 
subsequent analyses. The remaining 24 items consisted of 12 positive probes and 12 negative probes.
The experiment utilized a fixed set procedure, the memory set being held constant with respect to 
size and member categories throughout each trial block. Each of the six possible orderings of the 
three memory set sizes (one, two or three categories) was assigned to a different member of each of 
the six ability groups. The use of the category names was balanced over the experiment as a whole 
such that each category occurred six times in memory sets of M = 1, six times at each of two
positions in memory sets of M = 2, and six times at each of three positions in memory sets of M = 3.
Within ability groups, a given category appeared once in the M = 1 condition, once at each position
in the M = 2 condition, and once at each position in the M = 3 condition. The procedure also
balanced the occurrence of categories across the three trial blocks. The sampling of positive probes in
the M = 2 and M = 3 conditions was done in a manner designed to represent all 12 positions in the
Battig & Montague ranking of the dominance values of the items. Over the experiment as a whole
each positive item occurred in the sequences of 18 subjects, and each negative item occurred in the
trial sequence of all 36 subjects. The trial sequence presented to an individual subject therefore 
consisted of 72 critical trials, distributed over three memory set sizes and two response categories, 
each involving a single presentation of a probe word. 
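
The counterbalancing of block order and the count of critical trials can be reproduced with a brief sketch. The Python below is illustrative only: the text does not say which subject within a group received which ordering, so the assignment by index is an assumption, as are all the names used.

```python
from itertools import permutations

SET_SIZES = (1, 2, 3)
ABILITY_GROUPS = 6
SUBJECTS_PER_GROUP = 6
TRIALS_PER_BLOCK = 28
PRACTICE_TRIALS_PER_BLOCK = 4

# Every possible order in which the three set sizes can be tested.
orderings = list(permutations(SET_SIZES))

# One ordering per subject within each ability group (assignment by index
# is an assumption made for the purpose of the example).
assignment = {
    (group, subject): orderings[subject]
    for group in range(1, ABILITY_GROUPS + 1)
    for subject in range(SUBJECTS_PER_GROUP)
}

critical_per_block = TRIALS_PER_BLOCK - PRACTICE_TRIALS_PER_BLOCK
critical_per_subject = critical_per_block * len(SET_SIZES)

print(len(orderings), "orderings;", len(assignment), "subjects")
print(critical_per_subject, "critical trials per subject")
```

Three set sizes yield exactly six orderings, which is why groups of six subjects allow each ordering to be used once per ability group.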


Results 

Errors. The obtained error percentages have been summarized in Table 1. It can be seen 
that error frequency tended to increase as memory set size increased, and that incorrect 
negative responses to positive probes were generally more frequent than incorrect positive 
responses to negative probes. An analysis of variance of the error scores (following
application of an arcsin transformation) confirmed these conclusions, with F = 5.2,
d.f. = 2,60, P < 0.01 for memory set size, and F = 27.96, d.f. = 1,30, P < 0.001 for
response class. However, there was no evidence of a trend in error scores related to the
ability groups (F < 1, d.f. = 5,30). The rank-order correlation between the individual
intelligence test scores and error scores in categorization was computed but was found not
to be significant (rho = -0.05).


Table 1. Reaction time (ms) and error proportions (in brackets) together with slope and
intercept parameters for children of varying intelligence in the semantic categorization task
(Expt 1)

                                  Memory set size
Ability  Mean          1               2               3                    Slope         Intercept
group    VRQ       Yes     No      Yes     No      Yes     No     Mean    Yes    No     Yes     No
1        124      1067    1169    1013    1286    1118    1366    1170     26     98    1015    1077
                  (2.8)   (2.8)   (8.3)   (5.6)   (9.7)   (5.6)
2        116       913    1041     985    1124    1058    1352    1079     72    156     840     861
                 (11.1)   (0)     (9.7)   (1.4)  (12.5)   (5.6)
3        108       927    1059    1046    1204    1037    1330    1100     55    136     893     927
                  (?)     (1.4)   (9.7)   (2.8)   (5.3)   (8.5)
4         98      1049    1305    1109    1396    1186    1488    1256     68     92     978    1213
                  (2.8)   (1.4)   (5.6)   (6.9)  (12.5)   (4.2)
5         88      1189    1526    1273    1689    1307    2247    1538     59    360    1138    1100
                  (6.9)   (2.8)   (8.3)   (6.9)   (5.6)   (8.3)
6         76      1327    1923    1544    2478    1506    2163    1824     90    120    1280    1948
                  (6.9)   (1.4)  (15.3)   (8.3)  (16.7)   (0)
Mean              1079    1337    1162    1529    1202    1658    1328     62    160    1024    1188



These results indicate that error frequencies in categorization are related to the difficulty 
of the task (in so far as an increase in memory set size may be viewed as an increase in 
difficulty). However, a subject's liability to such errors seems to be unrelated to his 
performance on the verbal reasoning test. 
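
The transformation applied to the error scores before the analysis of variance can be sketched as follows. The text says only that an arcsin transformation was used; the Python below assumes the common arcsine square-root form applied to each subject's proportion of errors out of the 12 probes in a cell, and the function name is hypothetical.

```python
import math


def arcsine_sqrt(proportion: float) -> float:
    """Variance-stabilising transform asin(sqrt(p)), in radians."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))


# Each subject saw 12 positive and 12 negative probes per block, so k errors
# in a cell correspond to a proportion of k / 12 (assumed unit of analysis).
PROBES_PER_CELL = 12
for errors in (0, 1, 2, 3):
    p = errors / PROBES_PER_CELL
    print(f"{errors} errors: p = {p:.3f}, transformed = {arcsine_sqrt(p):.3f}")
```

The transform stretches proportions near zero, where error scores of this kind cluster, so that their variance depends less on their mean.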


Reaction times. Mean reaction times were calculated for each subject for positive and 
negative responses at each memory set size. The error reactions were excluded from this
calculation, together with a few trials on which the voice key had failed to operate. A
summary of the RTs is given in Table 1. 

Inspection of the reaction time data will suggest that response times tended to increase as 
ability level became lower, that positive responses were generally made faster than negative 
responses, and that reaction time increased as memory set size increased, the rate of 
increase being somewhat greater for negative than for positive responses. An analysis of 
variance was conducted with the aim of testing the reliability of these trends. The analysis 
confirmed that there were differences among ability groups (F = 3.71, d.f. = 5,30,
P < 0.01), that positive responses were faster than negative responses (F = 37.15,
d.f. = 1,30, P < 0.001), and that reaction time increased as a function of memory set size
(F = 18.65, d.f. = 2,60, P < 0.001). There was in addition a significant interaction of
memory set size and response (F = 5.81, d.f. = 2,60, P < 0.01). These results were further
qualified by an interaction between memory set size and ability groups, and a three-way 
groups x memory set size x response interaction. 

A Newman-Keuls test was applied with the aim of clarifying the effect of ability level on
the overall reaction time. The test indicated that the effect was primarily due to Group 6
whose responses were slower than those of any other group. Individual mean reaction
times were correlated with intelligence test scores and a moderate relationship was found
(rho = -0.47 for intelligence and positive RT, and rho = -0.53 for intelligence and
negative RT, P < 0.01 in each case). The positive RT was highly correlated with the
negative RT (rho = +0.93, P < 0.01). These findings are consistent with Eysenck's
contention that individuals of varying intelligence will differ in the time they take to 
perform simple tasks which involve an element of choice. 
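
The rank-order statistic used here can be illustrated with a short sketch. The individual scores are not reported, so the Python below applies the Spearman formula to the six group means of Table 1 (mean VRQ and overall mean RT); this is an assumption made purely for illustration, and a correlation over six group means is not comparable with the values obtained from the 36 individual subjects.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank-order correlation for untied data."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Group means from Table 1: mean VRQ and overall mean RT (ms), Groups 1 to 6.
mean_vrq = [124, 116, 108, 98, 88, 76]
mean_rt = [1170, 1079, 1100, 1256, 1538, 1824]

print(round(spearman_rho(mean_vrq, mean_rt), 2))
```

The coefficient for the group means is negative, in the same direction as the individual-level correlations, and the departure from a perfect rank ordering arises from Group 1, whose mean RT is slower than those of Groups 2 and 3.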

The hypothesis that the speed of functioning of a central comparison process is 
responsible for this effect can be evaluated by examining the relationship between ability 
level and the slope of the functions relating RT to memory set size. Table 1 includes the 
average slope and intercept values obtained by fitting straight lines (by application of the 
least squares procedure) to the data of the individual subjects and averaging these scores 
within the groups. The data are somewhat variable, and do not show evidence of a 
monotonic relation between intelligence and slope value. Inspection of the individual results 
revealed some instances of negative slope values for positive RTs (6 subjects in all) and 
negative RTs (2 subjects in all), and this precluded the use of analysis of variance as a test 
for a relationship between intelligence and slope value. The non-parametric Mann-Whitney 
U test was employed instead, but no significant contrasts were obtained. Rank order 
correlations between individual slope values and intelligence test scores were computed, but 
were found to be non-significant (rho = -0.06 for positive slope values, and rho = -0.01
for negative slope values). It seems, therefore, that our data fail to substantiate the 
hypothesis that the effect of intelligence on reaction time may be localized in a central 
comparison stage of processing. 

An alternative hypothesis is that the effect is localized 1n the encoding and response 
stages which precede and follow comparison. According to Sternberg's analysis the 
` durations of these input and output processes are contained in the zero intercept 
parameter, A, of the equation relating RT to memory set size. It can be seen from Table 1 
that the intercept values tended to increase as ability level decreased, although the results 
for the most intelligent children are out of line with this trend. An analysis of variance 
confirmed that the groups effect was significant (F = 2.84, d.f. = 5, 30, P < 0.05), and a 
correlational analysis indicated the existence of a moderate relation between intelligence 
and positive intercept value (rho = -0.37, P < 0.05) and between intelligence and negative 
intercept value (rho = -0.39, P < 0.05). Thus, although no distinction between the input 
and output stages can be made on the basis of these results, the data suggest that 
variations in intelligence may influence one or both of these processes. 

Experiment 2 


A second experiment was conducted with the aim of evaluating the relation between 
intelligence and Furneaux’s concept of continuance. This was defined by Furneaux (1960) in 
terms of an internal timing process which signals when the effort to solve a problem may 
terminate in failure. This was represented by Eysenck (1967) as an important non-cognitive 
determinant of test performance. It was considered that continuance could be assessed in 
the context of the categorization task by asking subjects to recall as many of the probe 
words as they could immediately following completion of the main test. This was a 
complex task on which all subjects could achieve a partial but incomplete success. It 
seemed likely that persistent subjects would continue their attempt at retrieval when faced 
with failures to find new items, and that this tendency would be indexed by the occurrence 
of long pauses toward the end of the recall sequence, and in particular before telling the 
experimenter that they could remember no more. 


Method 

Immediately following completion of the categorization task the subject was taken to another part of 
the laboratory and asked to report as many of the words he had seen displayed on the screen as he 
could, speaking into a microphone attached to a cassette tape-recorder, and to tell the experimenter 
when he was sure he could think no more by saying the word ‘stop’. 


Results 


The absolute level of incidental recall achieved is not of immediate interest in the present 
context. However, various analyses were carried out on the recall data, and these indicated: 
(1) level of recall was not consistently related to intelligence; (2) degree of clustering in free 
recall was not related to intelligence; and (3) recall of a word was not affected by the 
memory set size condition in which it had been presented. There were, on the other hand, 
significant tendencies for positive probes to be better recalled than negative probes (in a 
ratio of about 2:1), and for words occurring early or late in the experimental sequence to 
be better recalled than those occurring around the middle (primacy and recency effects). 

In order to evaluate the continuance hypothesis a temporal analysis of the tape-recorded 
recall protocols was undertaken. The PDP 12 computer was programmed to time a 
sequence of ‘on’ and ‘off’ states of a switch which the experimenter closed during 
production of words while listening to the tape-recordings. The computer then provided a 
record of speech times and intervening pauses for the whole period between instructing the 
subject to start recall and the subject’s report that he had finished. 

A number of potentially useful indices of persistence were derived from the pausing data, 
including: (1) the duration of the final pause between utterance of the last word in the 
subject’s recall and his saying ‘stop’; (2) the longest pause in the subject’s recall sequence; 
(3) a mean of the three longest pauses, the six longest pauses, or all of the pauses. The 
pausing measures reflecting delays occurring during list production were all highly 
correlated (average rho = +0.93). The correlation of the duration of the final pause with 
these pause measures was slightly lower, but nonetheless highly significant (rho = +0.62). 
The within-list pausing measures correlated with overall free recall score (rho = +0.5), but 
the duration of the final pause did not. None of the persistence measures was found to 
correlate significantly with intelligence, or with any of the performance indices derived from 
the analysis of the categorization task. 
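
As an illustration of how such indices might be derived from the timed record, the following sketch computes the final pause, the longest pause and two of the pause means. The function name, the data layout and the example timings are hypothetical, and it is assumed here that the latency before the first word counts as a within-list pause.

```python
def pause_indices(word_intervals, start_time, stop_time):
    """Derive persistence indices from a timed recall protocol.

    word_intervals: ordered (onset, offset) times in seconds, one per recalled
        word (at least one word is assumed).
    start_time: moment the subject was instructed to begin recall.
    stop_time: moment the subject began to say 'stop'.
    """
    within_list = []
    previous_offset = start_time
    for onset, offset in word_intervals:
        # Silence preceding each recalled word.
        within_list.append(onset - previous_offset)
        previous_offset = offset

    # Silence between the last recalled word and the report of finishing.
    final_pause = stop_time - previous_offset
    ranked = sorted(within_list, reverse=True)
    top_three = ranked[:3]
    return {
        "final_pause": final_pause,
        "longest_pause": ranked[0],
        "mean_three_longest": sum(top_three) / len(top_three),
        "mean_all": sum(within_list) / len(within_list),
    }


# Hypothetical protocol: four recalled words, then 'stop' at 27.5 s.
word_intervals = [(2.0, 2.5), (3.0, 3.5), (6.5, 7.0), (15.0, 15.5)]
indices = pause_indices(word_intervals, start_time=0.0, stop_time=27.5)
print(indices)
```

Keeping the final pause separate from the within-list measures reflects the finding that the two kinds of index behaved differently in relation to recall score.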


General discussion 


The two experiments were carried out with the aim of examining some implications of the 
Eysenck-Furneaux model of intelligence and of exploring the potential for a fruitful 
merging of the concerns of experimental and differential psychology. Had the predictions of 
the model been upheld, we would have expected to find that error frequency in the 
categorization task and the slope of the function relating RT to M were inversely related to 
intelligence, and that length of pause in the later stages of free recall was positively related 
to intelligence. As has been shown, none of these predictions was supported. This failure 
has implications for the model, and also for the relevance of information processing 
concepts to the analysis of human intelligence. 

We argued in the introduction that a contact between concepts of information processing 
and intelligence can be made by considering normatively easy tasks which provide reaction 
time data which can be related to particular component processes or stages. Eysenck’s 
(1967) description of the relationship between reaction time and problem difficulty predicts 
that there should be an effect of intelligence on the time required to perform an easy task, 
such as memory search or categorization. Our data gave some support to this idea, since 
we found that the overall RT in the categorization task was negatively related to 
intelligence. 

The semantic version of the memory search task was used because the possibility of 
division of the RT into slope and intercept parameters allowed us to examine the 
processing assumptions underlying the Eysenck-Furneaux model. The suggestion that 
differences in intelligence depend on the speed and accuracy of functioning of a central 
retrieval-and-comparison stage of processing was not supported by the data. Frequency of 
error in the categorization experiment was unrelated to intelligence, and we were unable to 
show that the slope parameter, B, was inversely related to intelligence. On the other hand, 
there was an effect of intelligence on the intercept parameter, A, which could reflect 
differences in word recognition processes, or in retrieval and production of a vocal 
response. An effect on the encoding process would be consistent with the known 
relationship between intelligence and reading ability (Vernon, 1971), and between reading 
ability and RT in categorization tasks (Perfetti & Lesgold, 1977). We may also note that 
the encoding and response stages which contribute to the value of A both involve an 
element of choice, among the members of a set of possible probes during encoding and 
between the binary ‘Yes’ and ‘No’ categories during response selection. In his later work 
Briggs argued that these stages could be analysed in information theoretic terms (Briggs & 
Swanson, 1970), i.e. in the same fashion as the choice reaction task discussed by Eysenck 
(1967). Hence, our results could be said to be consistent with Eysenck's proposals 
regarding a relationship between intelligence and speed of choice reaction, but not with our 
extrapolation from qualitative features of the model to predictions regarding effects on the 
central comparison stage. 

These conclusions are critically dependent on the validity of Sternberg's assumptions 
concerning the significance of the slope and intercept parameters of the RT function. There 
were a few subjects in our sample who produced negative values of B. The intercept values 
for these subjects were extremely high, and inspection of the data suggested the existence of 
a general tendency to trade a high intercept for a shallow slope, or vice versa. A 
correlational analysis confirmed this, since the individual A and B scores were negatively 
correlated, with rho = -0.61 for positive responses, and rho = -0.47 for negative 
responses (P < 0.01 in each case). In the semantic version of the memory search task it is in 
principle open to the subject to transform the probe to a category name before examining 
the memory set, or to encode the probe lexically and to consider the semantic relation with 
the memorized categories during the comparison stage. As was mentioned above, the 
experiments by Cruse & Clifton (1973) and Juola & Atkinson (1971) suggested that adult 
subjects normally favour the second of these strategies. The existence of a negative 
correlation between the slope and intercept parameters indicates that a proportion of our 
subjects opted for the first strategy, and it may be that individual subjects switched back 
and forth between the strategies during the experiment. Inspection of the data provided no 
evidence that preference for one strategy or the other was related to intelligence in any 
consistent way. It seems clear that the existence of a negative relationship between the slope 
and intercept parameter values invalidates Sternberg’s additive factor method as a tool for 
localizing effects of mental maturity within either the comparison or the encoding-response 
processing stages, and it would be desirable practice to report the correlation between these 
parameters in future studies of this kind. 
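
For readers who wish to report this correlation, a minimal sketch of a rank-order (Spearman) correlation between individual intercept and slope estimates is given below. The helper names and the five pairs of A and B values are hypothetical and serve only to show the computation; tied scores receive the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the product-moment correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-subject estimates: intercept A (ms) and slope B (ms per item).
intercepts = [600.0, 650.0, 700.0, 800.0, 900.0]
slopes = [80.0, 60.0, 70.0, 20.0, -10.0]

rho = spearman_rho(intercepts, slopes)
print(f"rho = {rho:.2f}")
```

A strongly negative value of this kind would signal the intercept-for-slope trade-off discussed above.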

Strategic considerations were probably also important in the measurements of 
continuance undertaken in Expt 2. The continuation or abandonment of an effort at free 
recall may be viewed as part of a cognitive strategy which takes account of data of various 
types, including the subject’s evaluation of his own competence, of the importance of the 
task, and of other costs and benefits which may be associated with success or failure. This 
suggests that continuance measures may be task- and situation-specific, and that they will 
not necessarily reflect a stable personality characteristic of the type postulated by Eysenck 
(1967). 

It might be argued that the intrusion of these strategic considerations into the discussion 
of the categorization and free recall tasks has served only to emphasize the inadequacies of 
these particular experimental procedures as techniques for testing the assumptions of the 
Eysenck-Furneaux model. However, an alternative conclusion might be that a strategic or 
executive processing level is essential in the description of any complex cognitive task, 
whether categorization, free recall or solving problems in Moray House verbal reasoning 
tests. The strategic level will call on subordinate routines to perform such basic functions as 
encoding, retrieval and comparison, and these functions may be isolated and studied by a 
proper application of the additive factor method of Sternberg (1969). Nonetheless, it seems 
likely that qualitative variations in the executive programmes which schedule and call the 
subordinate routines are more important as cognitive determinants of intelligence than 
differences in the speeds of functioning of the individual routines. 

If this position is adopted, the analogy between intelligent problem-solving and the 
making of a choice reaction which is fundamental to the Eysenck-Furneaux model may be 
rejected as being inappropriate in its emphasis. The model has been formulated in terms of 
information-processing concepts which are descriptive of subordinate routines rather than 
of the controlling executive. An experimental methodology of the type employed here may 
certainly be used to investigate characteristics of these routines. However, the experimental 
method is in general poorly adapted to the study of the strategic aspects of information 
processing, especially where these become complex. Thus, the model and the experimental 
investigations suggested by it are unlikely to lead to a significant clarification of concepts of 
human intelligence. This requires theories which deal directly with the executive level of 
processing such as those currently emerging from research in 'artificial intelligence' (Newell 
& Simon, 1972). 

To emphasize this point we can consider certain contrasts between an experimental task, 
such as categorization, and the more demanding tasks which are used in intelligence tests. 
The categorization task is one which is well understood by the subject, in so far as the 
instructions and general procedure specify the mental activity which must be carried out, 
and the data which must be operated upon. The Eysenck-Furneaux model suggests an 
analogy between a memory set (logically a disjunction of sets) and a criterion defining an 
acceptable problem solution, and also between a probe item and a potential problem 
solution. However, in real problem situations the criterion and potential solutions are not 
specified, but must be retrieved from memory by application of appropriate procedures. 
The difference between children of high and low intelligence presumably lies in their 
capacity to construct these elements as much as in their ability to determine whether or not 
they correspond. In other words, variation in intelligence may have more to do with 
capacity to determine what should be compared than with the speed with which the 
comparison can be made. 

The significance for intelligence of differences in birth-weight and health 
within monozygotic twin pairs 


R. W. Marsh 





This paper presents further evidence to demonstrate the existence of intra-uterine effects within the 
normal range of intelligence. The argument is then extended further to estimate the effects of organic 
factors in the environment that are also pathogenic for intelligence. The combination of these and 
intra-uterine effects is found to be a substantial part of the variance associated with environmental 
factors. Various implications of these factors are discussed. 





At least three studies have purported to show evidence which supports the idea that 
differences in birth-weight of monozygotic twins provide a basis for demonstrating 
significant intra-uterine effects on intelligence within the normal range (Babson et al., 1964; 
Willerman & Churchill, 1967; Scarr, 1969). Principal among the critics of this claim is 
Kamin (1974). The most important of his criticisms on this subject include: 

(1) a lack of control of possible examiner bias during the assessment of intelligence; 

(2) the abnormality of the birth-weights which actually contributed the significant variance 
within the sample; 

(3) the rather small numbers in the studies; 

(4) the uncertainty of monozygosity in some cases; 

(5) the small differences in the scores overall between heavier and lighter twins relative to 
the natural spread of scores within the population; 

(6) the lack of control for the effects of adverse perinatal factors. 

Points (3) and (5) may be accommodated simply by using larger samples. This applies to 
(6) also since ultimately one would expect complications such as breech birth to become 
spread more or less evenly between the heavier and lighter groups as the numbers of pairs 
increased. 

Allowance for these considerations and others was made in the design of a study carried 
out to investigate the problem further. 


The survey 


The author canvassed among members of a sample of twins being studied genetically at the 
University of New South Wales. From a total of 82 monozygotic pairs, 46 pairs 
volunteered to participate in the study. In each case monozygosity had been confirmed by 
an exhaustive analysis of blood groups. Health histories were obtained by the author 
which included both antenatal and perinatal details, and scores were taken on the 
Raven’s Progressive Matrices test (Sets ABCDE). The use of this test reduced to a 
minimum possible bias in administration and marking. All subjects were literate, with an 
age range of 16-68 years. All were of normal intelligence with an IQ range of 104-135, and 
a mean of about 125. (This high mean score probably reflects factors associated with the 
voluntary nature of participation in both the parent group and this particular sample of it, 
and also recent results indicate that there has been a substantial inflation in Raven test 
results during the last 30 years; Kyle, 1977.) Birth-weights ranged from 1362 g to 4086 g, 
with a mean of 2415 g and an average deviation within pairs of 137 g. In 24 pairs the 
heavier at birth had the higher intelligence, in two pairs the birth-weights and the IQs were 
the same, in three pairs the birth-weights were the same but the intelligence was different 
and in 16 pairs the lighter person had the higher IQ. These results showed a significant 
difference in favour of the heavier group (t = 1.68, P = 0.05), 2.5 per cent of the total 
variance being associated with this effect in this sample. (This may be obtained either 
through use of the analysis of variance, or by using the formula 1 - (r_MZI / r_MZIP), where r_MZI 
is the intra-class correlation, and r_MZIP is the correlation obtained after assigning the twins 
to groups according to whether they were the lighter or the heavier in their pair.) It is 
difficult to give an exact translation of this result in IQ terms, because of problems 
associated with the conversion of Raven raw scores to IQs. These problems include the 
non-uniform effect of the inflation referred to above, the representation in the norms of a 
large span of IQs for the same raw score, and the uncertainty about allowances for 
age factors even in an adult population because of the tendency for decline in older groups. 
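
The ratio formula quoted above can be checked against the published correlations with a short sketch. The function and variable names are illustrative only. The pairs of correlations are those given in Table 1 for three of the tests, and because they are rounded to two decimal places the computed percentages may differ slightly from the published ones.

```python
def variance_share(r_balanced, r_by_weight):
    """Proportion of variance associated with birth-weight assignment.

    r_balanced: intra-class correlation (balanced assignment of twins).
    r_by_weight: correlation after assigning twins as lighter or heavier.
    """
    return 1.0 - r_balanced / r_by_weight


# (balanced assignment, assignment by birth-weight), as reported in Table 1.
correlations = {
    "Verbal WISC": (0.86, 0.89),
    "Raven": (0.79, 0.81),
    "Dominoes": (0.71, 0.85),
}

shares = {
    test: round(100.0 * variance_share(r_bal, r_bw), 1)
    for test, (r_bal, r_bw) in correlations.items()
}
for test, per_cent in shares.items():
    print(f"{test}: {per_cent} per cent")
```

For the Raven scores of the present sample this reproduces the figure of 2.5 per cent given in the text.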

Many of these problems clearly do not apply in the same way to raw scores used for 
comparison within twin pairs. Thus the mean of the differences within twin pairs, which is 
the same in this case as the differences between group means, is almost one raw score 
point. Entering the norm tables at the place of the combined mean for the heavier and 
lighter groups indicates that this amounts to about three IQ points. Quite possibly the 
restricted range of intelligence within the sample reduced the size of the estimate of this 
effect. 

Despite various objections that have been made to the studies mentioned above, their 
results are similar to those presented here which show a significant association between 
birth-weight within twin pairs and non-verbal intelligence, as Table 1 shows. Since all the 
studies which have been addressed to this point have shown the same overall result one can 
only assume that fears about weaknesses in the design of some were substantially 
groundless. Indeed, one study which is represented as having no positive evidence to add to 
this point (Shields, 1962) produces a similar result when the appropriate analysis is carried 
out, as is shown in Table 1. The misinterpretation of the Shields study (Kamin, 1974, 
p. 168) highlights further misunderstanding by some proponents on each side of the 
controversy. Examination of the data of the present study showed no systematic association 
between the size of the difference in weight, absolute or relative, and intelligence within 
twin pairs. Nor did absolute weight appear to interact significantly with the effect of 
birth-weight differences within pairs. Of the 19 pairs where both twins were heavier than 
2500 g, in 12 pairs the heavier was the more intelligent, and of the 17 pairs where both 
weighed less than 2500 g, the heavier was the more intelligent in 12 pairs. In each group 
there was one pair where both twins weighed the same and scored the same, while for the 
four pairs where one twin weighed more than 2500 g and the other less, the heavier was the 
brighter in two cases. At best such a difference between groups might be claimed to be 
suggestive, but it is certainly not significant. 


Table 1. Effect of assignment by birth-weight on correlations between monozygotic twins 

                      No. of                With balanced   With assignment
Test scores           pairs     Source      assignment      by birth-weight   t*      1 - (r_MZI / r_MZIP)

Twins reared together
  Verbal WISC         27        1           0.86            0.89              2.78    3.4%
  Perf. WISC          27        1           0.91            0.95              4.91    4.2%
  Full WISC and Binet 36        1 and 2     0.90            0.94              5.28    4.3%
  DAP                 25        3           0.51            0.65              3.09    20.5%
  Raven               46        5           0.79            0.81              1.80    2.5%
Twins reared apart
  Mill Hill           23        4           0.77            0.80              1.98    3.7%
  Dominoes            23        4           0.71            0.85              4.50    16.5%

* All significant at P < 0.05. 
Sources. 
1. Willerman & Churchill (USA) 1967. 
2. Babson et al. (USA) 1964. 
3. Scarr (USA) 1969. 
4. Shields (UK) 1962. In only 23 pairs was there sufficient information to indicate with little doubt 
which twin was the heavier. 
5. Present study (Australia). 

Differences in birth-weights within monozygotic twin pairs are taken here simply as a 
crude indication of an underlying variable to which both are related, namely the relative 
favourableness of intra-uterine conditions; the correlation is no basis for the 
presumption of a direct or causal relationship between birth-weight and intelligence. 

Finally, the variation between samples in the size of correlations (shown in Table 1) is 
likely to be the product of more than just the usual sampling effects. These would include 
variations in the intellectual domains being measured by the different tests used, the degree 
to which the original test standardizations were appropriate for the various national groups 
reported here, and also the differences in the reliability of the tests used. One should also 
be wary in assuming that the degree of this effect will be the same for those who are not 
monozygotic twins, since plausible arguments can be made which support the conflicting 
expectations of either a larger or a smaller contribution than that found in monozygotic 
twins. 

In concluding this section it can be said that intra-uterine factors appear to contribute to 
differences in intelligence within the normal range. That this effect predominates in 
non-verbal tests provides further evidence in favour of the clinical practice of using 
performance scores that are significantly lower than verbal scores as an indication of 
organically based dysfunction of the central nervous system. 


Further considerations 


In recent years the role of intra-uterine factors in determining substantial intellectual 
subnormality has become understood in greater detail. Preceding parts of this paper 
indicate that they also appear to play a significant part in the determination of intellectual 
variation within the normal range. Upon reflection this does not appear to be an 
extraordinary finding. One need only consider the effects of some already known congenital 
causes. For example, the type and degree of intellectual handicap resulting from rubella is 
directly and specifically related to the time of illness during pregnancy thus establishing a 
continuum of effect from severe to moderate and even mild. To extrapolate such a 
continuum to a point where it is slight but still significant (even if not usually detectable in 
individual cases) before it disappears altogether, does not seem unreasonable. Another 
illustration comes from the effect of delays in commencing treatment for phenylketonuria, 
which results in an average reduction of three IQ points for each month of delay. 

This finding concerning the role of intra-uterine factors raises a further question 
concerning the role of postnatal constitutional factors as further and significant contributors 
to the population variance for intelligence. Brain injury, whether from illness or accident, 
and which is substantial but does not result in death, is recognized as a cause of grossly 
reduced intelligence. Knobloch & Pasamanick (1962), for example, present good evidence 
to show a continuum of such effects from severe to minimal in individual cases. The 
problem, however, is to assess the amount of these effects over the whole population. 

It seems reasonable to suppose that the same basic approach used earlier to estimate 
intra-uterine effects could also provide a way of estimating the amount of 
variability due to postnatal constitutional factors, as a first approximation at least. In this 
case it requires allocating each of the pair of twins according to whether one has had more 
substantial illness than the other, and more particularly whether that illness is known to 
have neurologic sequelae, either directly from penetrating wounds, toxins, or viral 
infections of the cortex, or indirectly from illnesses giving rise to factors with 
neuropathological associations such as oxygen deficit. For a number of reasons the criteria 
for allocation are not as clear cut as they are for birth-weight. Not only are the total effects 
of many illnesses incompletely understood, but even in individual cases where they are 
known they may not be fully reported to the person concerned, and ultimately investigators 
would need to go beyond subjective report. Also, significant differences in the health 
histories of pairs of twins are unlikely to be as common as measurable differences in 
birth-weight. 

Since the method required that only substantial differences in health histories be used as 
a basis for partitioning, this estimate is probably conservative. It may be thought that this 
effect could be due to disrupted schooling, but the majority of these illnesses occurred 
before or beyond the years of school attendance. 


Table 2. Effect of assignment according to illness in one twin on correlations between 
monozygotic twins raised apart (a) 

                                 Product    Intra-
Correlation                      moment     class     t        1 - (r_MZI / r_MZIP)

Dominoes test
  Illness in all pairs, n = 22   0.74       0.69      1.83*    6.8%
Mill Hill test
  Illness in all pairs, n = 22   0.72       0.70      0.51     3.5%

* P < 0.05. 
(a) The data are taken from Shields, 1962. 


The data given in Table 2 are taken from a group of monozygotic twins of normal 
intelligence who were reared apart, and where only one member of each pair had a 
substantial illness or health incident in their past lives. The set of twins who had been ill 
have a lower average level of intelligence than the set of healthier twins. 

When the data are treated in the same way as the previous example showing 
intra-uterine effects, this yields an estimate of variance associated with postnatal illness of 
5.1 per cent, with the non-verbal sources contributing more than twice as much variance as 
the verbal sources. 

The estimate for these effects plus intra-uterine effects amounts to 7 per cent plus of the 
total variance. To date they have been included in estimates of total environmental effects 
for which Burt's (1971) data give 12.6 per cent and Jinks & Fulker’s (1970) analysis of 
Shields’ data gives an average of 28 per cent. (Jinks & Fulker would have made a smaller 
estimate had the data, like Burt's, been corrected for unreliability.) For this type of 
analysis, Jinks & Fulker can be said to provide a conservatively large estimate of 
environmental variance, and given all the queries about Burt's work it can be represented 
as providing a questionably small such estimate. Taking, therefore, the 12.6 per cent and 
the 28 per cent as the likely lower and upper limits for the range of estimates of total 
environmental effects we can conclude that between one-quarter and two-thirds are 
determined by constitutional factors associated with intra-uterine and health variables. 
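
The arithmetic behind this range can be set out in a brief sketch. It assumes that the combined constitutional estimate is the sum of the 2.5 per cent obtained for intra-uterine effects in the present sample and the 5.1 per cent obtained for postnatal illness; the text itself gives the total only as "7 per cent plus", so the exact sum used here is an assumption.

```python
# Estimates in per cent of total variance.
intra_uterine = 2.5        # present sample, Raven scores
postnatal_illness = 5.1    # twins reared apart, mean of the two tests
constitutional = intra_uterine + postnatal_illness  # assumed combination

# Lower and upper limits for total environmental variance cited in the text.
environmental_low = 12.6   # Burt (1971)
environmental_high = 28.0  # Jinks & Fulker (1970)

# Share of environmental variance attributable to constitutional factors.
share_if_high = constitutional / environmental_high
share_if_low = constitutional / environmental_low

print(f"constitutional total: {constitutional:.1f} per cent")
print(f"share of environmental variance: {share_if_high:.2f} to {share_if_low:.2f}")
```

On these figures the share runs from a little over one-quarter to about three-fifths, which is in keeping with the range stated above.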

Of course the special characteristics of the pre-natal environment of twins must lead one 
to caution in accepting the results too specifically. However, even if these estimates for the 
effects of illness and intra-uterine factors are correct only to the extent that they show an 
additional group of factors separate from both genetic and cultural factors which accounts 
for a substantial amount of the variance previously allocated to the gross environmental 
category, then changes are required in the reasons that are given for differences in the 
normal range of intelligence. 

Environmental effects have been thought of as being more or less the product of 
cultural-functional processes, and physical-organic considerations have tended to be put to 
one side in so far as the normal range of intelligence is concerned. The present analysis 
shows, however, that congenital factors made up of hereditary and intra-uterine factors 
taken together reduce the relative contribution of purely environmental factors. Moreover, 
these remaining environmental factors are not exclusively of a cultural kind, but also have 
physical-organic or constitutional elements. This is not to deny the possibility that the two 
are related in some way, or sometimes are so interwoven as to defy separation. In the 
present case, however, the physical-organic source is the more direct and specific. In 
addition these organic factors with pathological associations appear to operate in a 
negative direction only, by reducing constitutional potential and, in turn, reducing the 
likely influence of purely cultural factors still further. 

This finding should not be taken to mean, however, that genetic effects are therefore the 
dominant influences in determining any differences in intelligence between groups of 
various kinds. The negatively biased influences of non-genetic constitutional sources may 
well override both cultural and genetic effects. Thus it seems hazardous to think of the 
relationship of genetic, antenatal, other constitutional and cultural factors as being fixed 
within groups; since between groups, places and times there could well be fluctuations in 
their relationships that are functions of different levels of mortality and morbidity in each 
group. It may well be fallacious, therefore, to assume that differences in genetic estimates 
are the result only of sampling differences or different treatments by investigators. Genetic 
estimates taken from one part of society may not be totally valid for another, and 
interpretations of differences which use such estimates should temper any conclusions at 
least by taking account of differences in the standards of health of the groups being 
compared. While the conclusions in this paper have obvious implications for intervention 
programmes in the area of social differences, they also go beyond that. 

Adult age differences in multi-source selection behaviour with partially 
predictable signals 


A. J. Maule and A. J. Sanford 














When important information arrives intermittently at spatially separate sources people must divide 
attention if they wish to assimilate it. Previous research has demonstrated that, under appropriate 
circumstances, more attention will be allocated to those sources where the information is more likely 
to occur. Some recent studies have suggested that this selectivity of behaviour is found to a much 
lesser extent in the elderly, though such results are ambiguous because of methodological 
inadequacies. The present paper introduces a new method for investigating the allocation of attention 
which avoids these previous difficulties. Nevertheless, the results do demonstrate an age-related loss 
in selectivity, suggesting these findings are reliable and of considerable generality. It is suggested that 
the elderly are poorer processors of stochastic information, and that in the present task this may be 
due to a reduced capacity for remembering the moment to moment probabilities of information being 
presented to each of the sources. 











When a man has to sample several separately located sources of information in order to 
check for critical events, he must adopt some kind of sampling strategy. Driving a motor 
car is a good example of such a task: it is only by the appropriate deployment of visual 
attention that adequate driving is possible. Crossing the road, and industrial tasks involving 
multiple dial watching provide other everyday instances. Researchers investigating these 
sorts of situations have identified the importance of determining the relation between the 
sampling of a source and the frequency with which critical events occur to that source. 

Hamilton (1969) investigated the nature of attention allocation in this situation by using 
the instrumental observing response (OR) as an index of sampling. Subjects were required 
to detect the presence of signals which periodically appeared at one of three sources, with 
signals presented to sources in proportion 6:3:1. A signal could be detected only while a 
response button was depressed (instrumental OR), and the equipment was designed so that 
each OR provided only a brief view of the state of the source. Since each OR was of brief 
duration, a wish to continuously observe the sources could only be satisfied by making a 
continuous stream of discrete responses. Hamilton was primarily concerned to determine 
whether subjects would make more samples to those sources where signals occurred more 
frequently, i.e. show selectivity between sources. 

His results showed that when the overall rate of signals per session was relatively slow, 
subjects’ sampling behaviour was not selective, and all sources received equal attention. 
However, if the overall number of signals per session increased, or when subjects’ overall 
response rates were slowed by pacing (e.g. they were allowed to make only one observation 
every 4 seconds), subjects became selective and responded more often to those sources 
where signals occurred more frequently. Both of these changes in experimental conditions 
increase the probability of a signal being found per observation, and Hamilton argued that 
it was the value of this probability which was critical in determining selectivity. 

Sanford & Maule (1971, 1973a) were interested in the performance of old people in these 
sorts of situations. They argued that sampling is a vital component of many skills and 
therefore may be important when explaining the frequently reported age decrement in 
skilled performance. In addition, they also suggested that this experimental procedure 
provides a method for testing Griew’s (1965, 1968) theory of compensatory strategies. 
Griew stated that elderly people may show an increased tendency to utilize the probability 
structure of sequences of events, enabling them to predict the occurrence of these events, 
thereby offsetting the effects of failing capacities like the ability to act quickly. 

Sanford & Maule (1971) argued that if Griew’s theory is true, then old subjects should 
exhibit greater selectivity than young subjects in the Hamilton OR situation. In a series of 
experiments (Sanford & Maule, 1971, 1973a) the results showed that the old were 
consistently less selective than the young, and this was established across a variety of overall 
signal rate and pacing conditions. These findings are not only evidence against Griew's 
theory, but suggest an important area of age-related changes in behaviour which may have 
implications for other areas of the psychology of ageing. Unfortunately, these results are 
open to alternative interpretation and demand further comment. 

Although it is superficially a good task for investigating the nature of attention 
allocation, the Hamilton task may not be particularly suited to the investigation of age 
differences. A number of studies using this task have shown that when there is no pacing 
restriction in operation, young subjects respond extremely quickly, making a continuous 
stream of brief observations throughout the session (Hamilton, 1969; Sanford & Maule, 
1971, 1973a; Hockey, 1973). 

Sanford & Maule (1973a) argued that at this fast rate of responding it was highly 
unlikely that the time between responses was sufficient for subjects to make an evaluation 
of current signal probabilities before every response. They concluded that there were two 
different kinds of observing response. One class of response occurred without any 
consideration of information about signal probabilities, and was therefore likely to be 
allocated to the three sources at random. The longer the period of this random responding, 
the more equal the distribution of responses between the three sources will be. A second 
class of observing response appeared to be based upon an evaluation of current signal 
probabilities, with subjects observing whichever source was assessed as most likely to 
present the next signal. The evidence for this class of observing was that subjects exhibit 
selective behaviour. The only difference between sources was in terms of signal presentation 
associated with each, and since results show that subjects consistently made more responses 
to those sources where signals were more likely, this suggested that signal probabilities did 
partly control subjects’ observing behaviour. Sanford & Maule argued that measures of the 
overall selectivity of observing behaviour depended upon the relative proportions of these 
selective and non-selective components. 
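
The dilution argument can be illustrated with a small sketch. It assumes, purely for illustration, that the selective component allocates observations in proportion to the 6:3:1 signal ratio and that the non-selective component is spread equally over the three sources. The function name and the mixing weights are hypothetical and are not taken from Sanford & Maule.

```python
def overall_allocation(selective_share, selective=(0.6, 0.3, 0.1)):
    """Mix a selective allocation with a uniform (random) allocation.

    selective_share: assumed proportion of observations that are based on
    an evaluation of signal probabilities (hypothetical parameter).
    selective: assumed allocation of the selective component.
    """
    uniform = 1.0 / len(selective)
    return [selective_share * p + (1.0 - selective_share) * uniform
            for p in selective]


# As the selective share falls, the overall allocation drifts towards
# equal responding on all three sources.
for share in (1.0, 0.5, 0.0):
    print(share, [round(x, 3) for x in overall_allocation(share)])
```

On this reading, a group whose selective share is smaller will appear less selective overall even if its selective responses are allocated in exactly the same way.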

Sanford & Maule (1973a) reported a similar pattern of rapid responding in elderly 
subjects and there are two reasons why such fast observing may cause reduced selectivity in 
the elderly. First, there is considerable evidence to suggest that there is a general slowing of 
cognitive activity with advancing age. It is likely that this will be so for evaluating critical 
signal probabilities, and therefore proportionately fewer observations will be based upon 
this information. This will lead to an increase in the proportion of random, equi-distributed 
responding, thereby diluting the overall measure of the selectivity of observing. Second, the 
present task induced all subjects to respond extremely rapidly. It could be that the 
performance of old subjects is disrupted by this need to respond rapidly, since fast 
responding has been identified as being of particular difficulty for the elderly. This task 
requirement may have made such demands upon available capacity that old subjects have 
little available for evaluating current signal probabilities. If this is so there will be a 
reduction in the proportion of observing based upon signal probabilities and thereby a 
reduction in the overall selectivity of observing behaviour. Both these suggestions show 
that previously reported age differences may be explained by a requirement for rapid 
response rates. However, a requirement to respond rapidly is an artificial constraint which 
was built into previous laboratory investigations of sampling behaviour, rather than a 
general characteristic of all everyday situations in which sampling is required. Therefore, it 
is difficult to determine whether the reported age effects are general findings or are simply 
restricted to laboratory investigations where subjects have been required to observe rapidly. 

Support for a more cautious interpretation of the previously reported age differences 
comes from other data reported by Sanford & Maule (1973a). They showed that differences 
in selectivity between young and old subjects were considerably reduced under paced 
conditions in which the overall rate of responding was reduced by allowing subjects just 
one observation every 4 s. Though age differences reduced when subjects did not respond 
rapidly, the significant difference which remained requires explanation. It is possible that 
the pacing restriction may also have had a detrimental effect upon the performance of old 
people because it required them to concentrate upon when they should respond at the 
expense of evaluating to which source the next response should be directed. Much evidence 
suggests a reduced channel capacity in the elderly (e.g. Talland, 1968), and the demands of 
pacing, requiring observation only at certain points in time, may leave insufficient capacity 
for evaluating individual source probabilities. This may also increase the likelihood that old 
subjects will distribute their responses randomly across sources. Thus the age-related 
differences in selectivity reported by Sanford & Maule may simply be a further 
demonstration that the behaviour of old subjects is disrupted by a requirement either to 
respond rapidly or to respond only at certain points in time (i.e. paced responding). The 
results may not provide evidence for any fundamental changes in sampling behaviour with 
advanced age. 

In order to evaluate these different interpretations of the Sanford & Maule data it was 
necessary to design a new experimental procedure for measuring sampling which 
incorporated neither response pacing nor the need for rapid responding. In choosing an 
alternative it was decided to adopt a procedure which in many ways is more akin to 
everyday situations. In particular, subjects were allowed to control the duration of each 
observation of an information source. If subjects showed any general tendency to make 
relatively long observations of sources, then the overall rates of responding should be 
slower, and this would be achieved without any artificial constraint like pacing. 

So as to provide a task better suited to evaluate age differences in sampling behaviour, a 
three-source observing response task with two important modifications was used. The first 
modification was in the stimulus information presented to subjects. Rather than 
unpredictable, all-or-none signals, each source consisted of a dial over which a pointer 
moved in steps towards a critical final level. Thus, an observation of a source provided two 
different kinds of information — whether a signal was present and, if not, how imminent it 
was. In many everyday situations critical events are preceded by stimuli which increase the 
predictability of these events (e.g. changes in the altimeter reading prior to a plane reaching 
a desired height). This modification was designed to broaden the scope of experimentation 
in monitoring situations. 

The second modification was in the way subjects observed the information sources. 
Unlike most previous studies, subjects were not obliged to take only very short, discrete 
observations, but were allowed to observe each source for as long as they liked. In a 
previous study where subjects were allowed to determine the length of each response, it was 
found that they persisted in making very many brief observations (Blair & Kaufman, 
1959). However, a pilot study with young subjects using the present experimental design 
showed that it induced a very different pattern of observing behaviour, with a tendency for 
far fewer, relatively long observations. These long observations were distributed 
throughout the session and subjects did not simply watch a source from the time a signal 
was imminent until it actually arrived. This general tendency to make longer responses 
throughout a session resulted in a decrease in the overall rate at which subjects observed 
the sources. Should this tendency to increase the duration of observations and reduce the 
rate be evident in old subjects, then the present task may be more appropriate for assessing 
potential differences in sampling behaviour in the young and old. The greater flexibility in 
observing afforded by the present task should provide a better procedure, since it does not 
impose constraints likely to adversely affect the older subjects. 


Method 
Subjects 


Twenty-four young subjects were recruited from the student population at the Dundee College of 
Education and were of average age 22.2 years (range 17-28 years). The 24 old subjects were members 
of the Medical Research Council Panel at the University of Liverpool, and were of average age 69.75 
years (range 65-79 years). Both age groups comprised 16 females and eight males and were 
reasonably matched educationally, with the old having a slightly higher Mill Hill vocabulary score 
(36.1) than the young (32.3), a difference which is significant (U test, P < 0.05). 


The task 


The task was an analogue of an industrial situation with subjects required to monitor the states of 
three information sources, each described as the controls of an oven cooking a particular bakery 
product. Each information source consisted of a box on which were situated two response buttons, 
one below the other, and a shutter behind which was a dial with a pointer and graduated scale. The 
boxes were arranged in a line in front of the subjects as shown in Fig. 1. 





Figure 1. The three information sources (i.e. ovens) with machine 1 currently being observed, with the 
shutter raised showing that the pointer is at level 4 in the cooking cycle. 


Subjects could determine the state of any product inside the oven by pressing the lower of the two 
response buttons, which was labelled ‘look’. For the period that the ‘look’ button remained 
depressed a solenoid kept the shutter raised allowing the subject to observe the position of the pointer 
on the dial. When the button was not depressed, the dial was hidden behind the shutter. 

The dial was graduated into 10 different levels. The pointer started at the extreme left-hand 
graduation and, at unpredictable points in time, moved towards the right, one step at a time. The 
final point, level 10, had a yellow background, unlike the rest which were white. Subjects were told 
that levels one to nine represented progressive cooking stages, and level 10 signified the completion of 
the process. From the subjects’ point of view, sources were to be discovered immediately they reached 
level 10, or as soon afterwards as possible, so that a new batch of product could be started in the 
oven (i.e. restart a new cooking cycle). This was achieved by pressing the second (upper) response 
button, labelled ‘reset’, which set the pointer back to level one on the dial. Thus subjects were 
required to check the states of ovens by depressing the ‘look’ button, and to minimize the amount of 
time any oven was out of action due to the completion of the cooking process. They could only 
check one source at a time. 

In order to emphasize the importance of minimizing the time for which any oven was at level 10, a 
counter was attached to each source and this incremented by one unit per second during this period. 
The counters were labelled ‘£s lost’ and were said to represent the money lost due to the suspension 
of production on that machine. These counters were visible to subjects during the instruction phase, 
but were kept out of sight during the experiment proper. The totals served to provide end-of-session 
feedback. 

During the session each oven completed a different number of cooking cycles, and these were in the 
ratio 6:3:1 for the three machines. The pointer was actually a voltmeter, and its movements were 
controlled by a paper tape reader. Each signal pre-recorded on the tape incremented the voltage level 
on the meter thereby cumulatively stepping the pointer until the maximum level was reached. This 
level was maintained, and all further signals were ineffective until the subject had detected this state. 
The intervals between each step of the voltmeter were variable. The average interval between each 
movement from level n to level n+ 1 (hereafter called advance signals), and the range of intervals used 
are presented in Table 1. The program controlling intervals between advance signals was constructed 
in six independent 5 min blocks. In any block, the sequence of intervals for each source was derived 
by randomly selecting (without replacement) from equal numbers of the intervals. This produced an 
advance signal distribution which was rectangular in character on all three sources. In addition, the 
mean number of level 10 states (hereafter called events) for each source is also presented in Table 1. It 
should be recognized that there are exactly nine advance signals per event, and this remains 
constant across all conditions. In the present experiment two overall event rates were used, called 
Slow and Fast, since this has been found to be an important determinant of selectivity in tasks of this 
kind. 


Table 1. The number of events, the mean inter-advance signal interval and range of intervals, 
for each source at the slow and fast event rate conditions 

                                          Slow                              Fast
                               Source A  Source B  Source C     Source A  Source B  Source C
Event frequency                   66        33        11           132       66        22
Mean inter-advance signal
  interval (s)                     3         6        18           1.5        3         9
Inter-advance signal
  interval range (s)             1-5      2-10      6-30       0.5-2.5      1-5      3-15
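
The construction of the advance signal program, and its relation to the event frequencies in Table 1, can be sketched as follows. This is a minimal illustration and not the original paper tape program: the whole-second interval set for Source A at the slow rate, the number of copies of each interval and the function names are assumptions chosen so that one block fills 5 min. Only the ranges, the means, the six 5 min blocks and the nine advance signals per event come from the text.

```python
import random

SESSION_S = 30 * 60        # testing phase of half an hour
ADVANCES_PER_EVENT = 9     # steps from level 1 to level 10


def make_block(intervals, copies, rng):
    """Use each interval `copies` times in random order.

    Shuffling a fixed pool is equivalent to sampling without replacement
    from equal numbers of each interval, giving a rectangular distribution.
    """
    block = [i for i in intervals for _ in range(copies)]
    rng.shuffle(block)
    return block


def expected_events(mean_interval_s):
    """Approximate number of level 10 events in one session."""
    return SESSION_S / (ADVANCES_PER_EVENT * mean_interval_s)


rng = random.Random(0)
# Assumed interval set for Source A, slow rate: 1, 2, 3, 4 and 5 s.
block_a = make_block([1, 2, 3, 4, 5], copies=20, rng=rng)
print(sum(block_a))  # block length in seconds

for mean_s in (3, 6, 18, 1.5, 3, 9):
    print(mean_s, round(expected_events(mean_s), 1))
```

Under these assumptions the expected counts fall within one or two events of the frequencies listed in Table 1, the small differences presumably reflecting whole numbers of events per block.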


The frequency with which subjects looked at each source was recorded by a counter wired to the 
‘look’ button. The duration of each observation was measured by means of a multi-channel pen 
recorder. This procedure provided a measure of the observing behaviour of subjects who were set to 
devise the best way of allocating their ‘looks’ between the three sources so as to reset the machines at 
the moment the product inside was ‘cooked’, or as soon as possible after this, thereby minimizing the 
money lost over a session. 


Procedure 


All subjects attended two 1-hour sessions not more than 2 days apart. The testing phase lasted half 
an hour, the remaining time being devoted to instructions and a post-test interview where subjects 
talked in general terms about the task and were asked to assess the individual source event 
probabilities by saying how 10 such events would normally be distributed across them. 

Each age group was split into two subgroups with each run at one of the two overall event rate 
conditions (i.e. slow or fast). Within each overall rate condition subjects monitored three sources, one 
of which presented signals at a relatively fast rate (Source A), one at a medium rate (Source B) and 
one at a relatively slow rate (Source C). These three rates were programmed independently to the 
sources and produced differential event probabilities of 0.6, 0.3 and 0.1 respectively, like those used by 
Sanford & Maule (1971, 1973a). The assignment of these different rates to the three sources was held 
constant for any subject, and the exact configuration of rates to machines was made on a Latin square 
basis so that signal rates were counter-balanced across sources. 
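
One way such a counterbalancing scheme might be generated is sketched below. The cyclic form of the square, the machine labels and the rule for assigning rows to subjects are assumptions; the text states only that a Latin square was used.

```python
RATES = ("A", "B", "C")                                  # fast, medium, slow
MACHINES = ("machine 1", "machine 2", "machine 3")       # hypothetical labels


def latin_square(items):
    """Cyclic Latin square: each item occurs once per row and per column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assignment_for(subject_index):
    """Rate given to each machine for one subject (assumed rotation rule)."""
    square = latin_square(RATES)
    return dict(zip(MACHINES, square[subject_index % len(square)]))


for subject in range(3):
    print(subject, assignment_for(subject))
```

With 12 subjects in each age by rate subgroup, a rotation of this kind would place each rate on each machine for four subjects.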

Subjects were instructed on the purpose of the task using the cooking analogy throughout. It was 
pointed out that it was efficient to observe a given source as the pointer moved into the final sector, 
but, whilst observing one source, they should be wary in case, meanwhile, another source reached the 
critical point undetected. Also, during the instruction phase, the ‘£s lost’ counters were shown to 
subjects to emphasize the importance of resetting a machine as soon as it reached the final sector. At 
the end of the second session all subjects were briefly interviewed in order to ensure that they fully 
understood what the task was about. 


Results 


Since the present study was concerned only with stable asymptotic performance of the two 
age groups, only data from the last 15 min of the second session were analysed. Sanford & 
Maule (1973a) found that observing behaviour stabilized with this amount of practice. 


Frequency and duration 


It was anticipated that the present task would induce subjects to respond with fewer, but 
relatively longer, responses than those reported in previous studies. Table 2 shows the 
mean number and the median duration of these responses averaged across subjects in each 
group. The median was selected in preference to the mean since a preliminary analysis of 
the distribution of response durations suggested they might be approximately exponential 
in character, with some responses lasting longer than 20 s. Figure 2 shows the overall 
distribution of all response durations made by the two age groups at each event rate 
condition. These distributions were calculated by counting the frequency of observations 
between 0 and 0.99 s, 1.00 and 1.99 s, and so on up to 16.00 and 16.99 s, with a final category of 17.00 s 
and over. The frequencies of occurrence within each category are grand totals for all 
subjects within that group. Though the data are not presented here, it is important to note 
that the general form of this distribution is evident when considering observations to 
individual sources. The significance of such response distributions is not of direct concern 
here, and will be discussed in a later publication (Maule, in preparation). However, these 
results demonstrate that all subjects make relatively long observations, and there is also a 
good deal of variability in the duration of these responses. 


Table 2. The mean number of responses and the median durations of responses to each 
source (s) for young and old subjects at two event speeds 

                Mean number        Median duration of responses (s)
                of responses       Source A    Source B    Source C
Slow
  young             201              3.78        2.26        0.99
  old               211              3.51        2.75        1.44
Fast
  young             261              3.30        1.63        0.72
  old               248              2.39        2.83        1.22
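
The tallying of response durations into 1 s categories, with a final open category of 17.00 s and over, can be sketched as follows. The function name and the sample durations are hypothetical and serve only to show the category boundaries.

```python
def duration_histogram(durations_s, last_edge=17):
    """Count durations in 1 s categories: 0-0.99, 1.00-1.99, ..., 17.00 and over."""
    counts = [0] * (last_edge + 1)
    for duration in durations_s:
        # int() truncates, so 3.78 falls in category 3; anything of 17 s
        # or more is pooled into the final category.
        counts[min(int(duration), last_edge)] += 1
    return counts


sample = [0.4, 0.99, 1.0, 3.78, 16.99, 17.0, 25.3]   # illustrative values only
print(duration_histogram(sample))
```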

A two-way analysis of variance of the mean number of responses made by each group 
revealed no effect of age (F = 0.00, d.f. = 1,44) nor signal speed (F = 2.22, d.f. = 1,44) nor 
any interaction (F = 0.13, d.f. = 1,44). Thus, overall rates of responding were similar for 
both age groups and across both event conditions. The median durations of observations to 
the three sources made by each group of subjects were investigated using a three-way 
analysis of variance. The analysis evaluated the effects of two between-subjects factors of 
age and event speed, and one within-subjects factor of sources (i.e. comparing across the 
three information sources). Table 3 summarizes this analysis. 

It can be seen that there were no effects attributable to age or event speed, nor any 
interaction between these two factors. There was, however, a highly significant effect 
attributable to sources (P < 0.001), and Table 2 shows that the median duration of 
observations was longer to sources at which events were more frequent. Of the remaining 
interaction terms, only the age × sources interaction was significant (P < 0.05), and Table 2 provides 
an explanation of this effect. Young subjects tended to make longer observations to source 
A and shorter observations to source C than did the elderly. Both age groups made 
relatively longer observations where events were more likely to occur, but this occurred to 
a greater extent in the young. It appears that if differentiation between sources is 
considered in terms of the length of observations to each source, the elderly differentiate 
between sources less well than the young. 


[Figure 2: four histogram panels (Young slow, Young fast, Old slow, Old fast); horizontal axis, response duration in 1 s categories from 0 to 17 and over; vertical axis, frequency (F).] 

Figure 2. The response duration distributions, represented in terms of the frequency (F) with which 
subjects in each group made observations between 0 and 0.99 s, 1.00 and 1.99 s, and so on up to 
16.00 and 16.99 s, and 17.00 s and over. 


Table 3. Analysis of variance comparing the median durations of observing the three 
sources for subjects at two ages and two event speeds 

Source of variation               SS        d.f.         MS           F
A (age)                        21049.2        1        21049.2       0.46
B (event speed)                69388.3        1        69388.3       1.53
AB                               158.3        1          158.3       0.00
AB error term                1995523.8       44        45352.8
C (information sources)      1125711.8        2       562855.9      29.38***
AC                            133024.3        2        66512.1       3.47*
BC                             24074.0        2        12037.0       0.63
ABC                            27473.3        2        13736.7       0.72
ABC error term               1685776.6       88        19156.6
Total                        5082179.7      143

* P < 0.05; *** P < 0.001. 
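
As a check on the arithmetic of Table 3, the F ratios can be recomputed from the sums of squares and degrees of freedom. The sketch below reads the separators in the printed table as decimal points and tests each between-subjects effect against the AB error term and each effect involving sources against the ABC error term; the variable and function names are illustrative.

```python
# (sum of squares, degrees of freedom, error term used), as read from Table 3.
EFFECTS = {
    "A": (21049.2, 1, "between"),
    "B": (69388.3, 1, "between"),
    "AB": (158.3, 1, "between"),
    "C": (1125711.8, 2, "within"),
    "AC": (133024.3, 2, "within"),
    "BC": (24074.0, 2, "within"),
    "ABC": (27473.3, 2, "within"),
}
ERRORS = {"between": (1995523.8, 44), "within": (1685776.6, 88)}


def f_ratio(effect):
    """Mean square of the effect divided by the mean square of its error term."""
    ss, df, error_name = EFFECTS[effect]
    error_ss, error_df = ERRORS[error_name]
    return (ss / df) / (error_ss / error_df)


for name in EFFECTS:
    print(name, round(f_ratio(name), 2))
```

The recomputed ratios agree with the tabled values to two decimal places.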

The gross response profiles for the two age groups show some very interesting trends. 
Subjects made fewer, relatively long observations as compared with all previous studies. 
The rates of responding lie between 13 and 17 responses per minute and therefore far 
outside the range where speed stress problems are likely to disrupt the performance of old 
subjects. 
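
The quoted response rates follow from the mean numbers of responses in Table 2, on the assumption that those counts refer to the 15 min of the second session that were analysed. A short check, with illustrative names:

```python
ANALYSED_MINUTES = 15   # last 15 min of the second session

MEAN_RESPONSES = {      # mean number of responses, from Table 2
    ("slow", "young"): 201,
    ("slow", "old"): 211,
    ("fast", "young"): 261,
    ("fast", "old"): 248,
}

rates_per_min = {group: n / ANALYSED_MINUTES for group, n in MEAN_RESPONSES.items()}

for group, rate in rates_per_min.items():
    print(group, round(rate, 1))
```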


Selectivity of observations 


In previous studies each observation was very brief and of relatively invariant duration, 
and so the attractiveness of a source might be measured by calculating the relative 
frequency with which that source was observed. Indeed, the proportion of observations 
directed to a source was equivalent to the total time spent observing that source. In the 
present task, duration of observations varied so that the frequency of observing and the 
time spent observing were not necessarily equivalent. The most sensitive measure of 
selectivity was likely to be the amount of time spent observing a source and this was taken 
as the primary index. The frequency measure was less sensitive since a continuing wish to 
observe a particular source could be achieved by making a single relatively long response 
rather than several short ones. 
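
The distinction between the two indices can be illustrated with a hypothetical log of observations. In the sketch below each entry is a (source, duration in seconds) pair; the data and the function name are invented for illustration and do not come from the experiment.

```python
def selectivity_measures(observations, sources=("A", "B", "C")):
    """Return (proportion of looks, proportion of observing time) per source."""
    n_looks = len(observations)
    total_time = sum(duration for _, duration in observations)
    by_frequency = {
        s: sum(1 for src, _ in observations if src == s) / n_looks
        for s in sources
    }
    by_time = {
        s: sum(duration for src, duration in observations if src == s) / total_time
        for s in sources
    }
    return by_frequency, by_time


# Hypothetical subject who looks equally often at each source but
# dwells longest on source A.
log = [("A", 6.0), ("B", 2.0), ("C", 1.0), ("A", 5.0), ("B", 3.0), ("C", 1.0)]
by_frequency, by_time = selectivity_measures(log)
print(by_frequency)
print(by_time)
```

For such a subject the frequency index shows no preference between sources, whereas the time index shows a clear preference for source A, which is why time spent observing was taken as the primary index.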

Figure 3 shows both the proportion of total response time spent observing each source 
and the relative proportion of occasions that each source was selected. A first check for 
selectivity was made by testing whether the rank order of preferences for responding to 
sources is comparable to the rank order of event presentations to the sources. This relation 
was tested using the Friedman test (Siegel, 1956, p. 166) on the data for each of the 
groups separately. The resulting values of χr² were significant for all conditions (P < 0.01 
or better), demonstrating that subjects spend relatively longer observing, 
and look more frequently at, those sources where events were more likely. To determine 
whether there are selectivity differences between the various conditions a two-way analysis 
of variance was applied to the data taking responding to each information source 

Figure 3. The proportion of all observing which was made to each of the three sources, considered in 
terms of both the duration and the frequency of responding: @, fast event speed; W, slow event speed, 
——-, young subjects; ———, old subjects 


separately. In terms of the proportion of time spent observing source A, there was a highly 
significant effect of age (F = 25.41, d.f. = 1,44, P < 0.001), yet no effect of event speed 
(F = 0.00, d.f. = 1,44) nor any interaction (F = 1.78, d.f. = 1,44). This result shows that 
young subjects spend proportionately longer than the elderly observing source A. The 
measure of the proportion of time on source B also revealed a significant effect of age 
(F = 5.19, d.f. = 1,44, P < 0.05) but no effect of event speed (F = 0.07, d.f. = 1,44) nor 
interaction (F = 0.97, d.f. = 1,44). Similarly, the proportion of time observing source C 
showed a significant effect of age (F = 9.44, d.f. = 1,44, P < 0.01) without any effect of 
event speed (F = 0.08, d.f. = 1,44) or any interaction (F = 0.08, d.f. = 1,44). These results 
demonstrate that old subjects are less selective, and their reduced tendency to spend time 
observing the frequent source A was complemented by a greater tendency to observe the 
relatively infrequent sources B and C. 
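
The Friedman statistic used for the rank order check described above can be computed from each subject's ranking of the three sources. The sketch below uses the standard formula for untied ranks; the ranks shown are hypothetical and are not the data of the present experiment.

```python
def friedman_chi_r2(rank_rows):
    """Friedman statistic for N subjects ranking k conditions (no ties).

    chi_r^2 = 12 / (N k (k + 1)) * sum(R_j^2) - 3 N (k + 1),
    where R_j is the sum of ranks given to condition j.
    """
    n = len(rank_rows)
    k = len(rank_rows[0])
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(k)]
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Four hypothetical subjects who all spend most time on A (rank 3),
# then B (rank 2), then C (rank 1).
ranks = [(3, 2, 1)] * 4
print(friedman_chi_r2(ranks))
```

With complete agreement between subjects the statistic takes its maximum value of N(k - 1); with ranks allocated at random it would be expected to lie near k - 1.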

Measures based upon the relative frequency of observing sources were also subjected to 
an analysis of variance with responding to each source considered separately. The results 
showed that for source A there was no effect of age (F = 0.44, d.f. = 1,44) nor of event 
speed (F = 0.32, d.f. = 1,44) nor any interaction (F = 0.21, d.f. = 1,44). A similar pattern 
of results was evident for source B, with no effect of age (F = 0.18, d.f. = 1,44), nor event 
speed (F = 1.22, d.f. = 1,44) nor any interaction (F = 0.18, d.f. = 1,44). The same was also 
true for source C, with no significant age effect (F = 0.02, d.f. = 1,44) nor event speed effect 
(F = 0.35, d.f. = 1,44) nor any interaction (F = 0.01, d.f. = 1,44). Thus all subjects make 
more observations where signals occur most frequently, but the distribution of observations 
between sources was similar for all the groups. 


Efficiency of performance 


A measure of the optimality of sampling is the average time that events remain undetected. 
Comparisons between age groups required some correction since old subjects tended to 
execute a reset response more slowly, thereby increasing the average durations of their 
correction lags. This was allowed for by determining the average time taken to reset when 
subjects were observing the source when an event occurred. This correction latency was 
subtracted from the total time lost so that a corrected average time lost per event was 
calculated for each group. The corrected data are presented in Table 4 and were subjected 
to a two-way analysis of variance. Results showed a significant main effect of age 
(F = 8.38, d.f. = 1,44, P < 0.01) but no effect of event speed (F = 1.06, d.f. = 1,44) nor any 
interaction (F = 3.01, d.f. = 1,44). Thus old subjects allowed events to remain undetected 
for longer than their juniors and this represents less optimal performance. 


Table 4. The average time lost per event (s) for young and old subjects at two overall event 
rates 

           Young      Old
Slow        0.45      0.62
Fast        0.64      0.89
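
One reading of the correction procedure is that the mean reset latency, measured on occasions when the subject was already watching the source as the event occurred, is subtracted from the mean time lost per event. The sketch below follows that reading; the function name and the figures are hypothetical.

```python
def corrected_time_lost(total_lost_s, n_events, reset_latency_s):
    """Mean time lost per event after removing the motor time needed to reset.

    total_lost_s: summed time for which sources stood at level 10.
    n_events: number of events in the analysed period.
    reset_latency_s: mean reset time when the event was seen as it occurred.
    """
    return total_lost_s / n_events - reset_latency_s


# Hypothetical subject: 99 s lost over 66 events, mean reset latency 0.9 s.
print(round(corrected_time_lost(99.0, 66, 0.9), 2))
```

The correction removes the component of the score that reflects slower execution of the reset response, so that the remaining difference between age groups reflects the time for which events went unobserved.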


Knowledge of system probabilities 


It was important to determine whether both age groups were equally successful in learning 
that the different sources carried different event rates. To this end, subjects were asked to 
assess how 10 events would usually be distributed between the three sources, and the means 
for each group are presented in Table 5. All groups were realistic in their assessments, even 
by the end of the first session. All subjects correctly rank ordered the sources with respect 
to event frequency and there was apparently no difference between the age groups. 


Table 5. Estimates as to the proportion of events to each source (out of 10) for young and 
old subjects at the end of the first and second sessions for the two event rate conditions 

                       Slow                                    Fast
           Session One       Session Two         Session One       Session Two
            A    B    C       A    B    C         A    B    C       A    B    C
Young      5.8  3.1  1.2     5.7  3.2  1.2       5.3  3.3  1.2     5.7  3.3  1.0
Old        5.6  3.0  1.4     5.3  3.3  1.3       5.3  3.2  1.5     5.6  3.1  1.3
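
The accuracy of these subjective estimates can be summarized by comparing each group mean with the programmed 6:3:1 distribution. The sketch below reads the entries of Table 5 as decimal values (two entries are partly illegible in the source and are read here as 5.8 and 1.5, which is an assumption); the error measure and the names used are illustrative and were not part of the original analysis.

```python
PROGRAMMED = (6.0, 3.0, 1.0)   # events out of 10 on sources A, B and C

# (age group, event rate, session): mean estimates for A, B, C, from Table 5.
ESTIMATES = {
    ("young", "slow", 1): (5.8, 3.1, 1.2),
    ("young", "slow", 2): (5.7, 3.2, 1.2),
    ("young", "fast", 1): (5.3, 3.3, 1.2),
    ("young", "fast", 2): (5.7, 3.3, 1.0),
    ("old", "slow", 1): (5.6, 3.0, 1.4),
    ("old", "slow", 2): (5.3, 3.3, 1.3),
    ("old", "fast", 1): (5.3, 3.2, 1.5),
    ("old", "fast", 2): (5.6, 3.1, 1.3),
}


def mean_abs_error(estimate):
    """Mean absolute deviation of an estimate from the programmed values."""
    return sum(abs(e - p) for e, p in zip(estimate, PROGRAMMED)) / len(PROGRAMMED)


def correctly_ordered(estimate):
    """True when the estimate ranks the sources A > B > C."""
    return estimate[0] > estimate[1] > estimate[2]


for group, estimate in ESTIMATES.items():
    print(group, round(mean_abs_error(estimate), 2), correctly_ordered(estimate))
```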


Discussion 

Young and old subjects made relatively long observations, and so responded at a 
comparatively slow rate. It is, therefore, highly unlikely that the performance of the elderly 
subjects was disrupted by any factors related to the problems of fast responding. Despite 
this, old subjects demonstrated a reduced selectivity of observing of the high frequency 
source. Reduced selectivity therefore seems to be a feature of the sampling behaviour of the 
elderly in monitoring tasks. Age differences do not result from a failure of the elderly to 
have an appropriate internal representation of the relative probabilities of critical events 
across sources. Both groups subjectively assessed source probabilities equally accurately. 
Thus, on the basis of all studies to date, we may assume that sampling behaviour in the 
elderly is less selective. It is so, regardless of whether the predictability of critical events is 
increased by added information, or whether subjects observe the sources by making many 
short responses, or fewer, relatively long ones. In the present experiment, this reduced 
selectivity of sampling in elderly subjects can be represented in terms of differences in more 
gross features of observing. Though all subjects make relatively longer observations of 
those sources where events are more probable, this strategic allocation of sampling time is 
less marked in the elderly. These results also provide direct evidence that the behaviour of 
the elderly is less optimal, since they allow critical events to remain undetected for longer 
than young subjects. 

These findings are consistent with other studies which have investigated age differences in 
situations requiring subjects to choose between alternatives which occur with different 
probabilities. For instance, it has been found that old subjects are less likely to choose the 
most frequently occurring alternative in a probability learning task with a financial reward 
for correct guesses (Sanford & Maule, 1973b), and in a similar task when knowledge of 
results is not given on a proportion of trials (Sanford et al., 1977). The only situation of 
this kind not to produce an age difference in behaviour is a simple probability learning task 
without reward (Sanford et al., 1972). Overall, the evidence suggests that old subjects in 
these sorts of laboratory tasks behave less optimally, and that their ability to utilize the 
probability structure in sequences of stimuli is much poorer. This is contrary to Griew's 
theory of compensatory strategies, and we must conclude that his theory is an 
inappropriate way of describing the behaviour of elderly people. 

The present experimental procedure differs from previous studies in a number of 
interesting ways. Perhaps the most important difference is that extra or added information 
increases the predictability of critical events. It is of considerable theoretical interest to 
determine how subjects utilize this extra information. Maule (1976) varied the amount of 
added information provided to young subjects in a monitoring situation like the present 
one. In general his results showed that their performance became more selective as the 
amount of added information was increased. It seems therefore that young subjects may 
use added information to increase the selectivity of their monitoring. 

There must, however, be some doubt concerning the ability of old subjects to take 
advantage of this added information. Rabbitt (1964) reports a study investigating the 
effects of extra information in a choice reaction-time task with variable fore-periods. This 
extra information could reduce uncertainty associated with either the next fore-period 
duration, or the next reaction stimulus likely to occur. When present, the information 
reduced the reaction times of young subjects, but not those of the elderly. This result is only one 
example of a general reduction with age in the ability to utilize information about the way 
signals are presented (see Rabbitt, 1968). Thus it seems unlikely that old subjects will be 
able to use the added information in the present task as effectively as the young and this 
could be partly responsible for the age differences we found. However, more evidence is 
needed concerning the way different age groups handle added information in monitoring 
tasks. 

Sanford & Maule (1973a) consider two theories to explain age differences in selectivity, 
which differ in terms of the levels of explanation they adopt and do not represent 
conflicting approaches. The first theory relies upon the fact that low states of arousal (e.g. 
as induced by sleep deprivation) reduce attentional selectivity (Hockey, 1973). If older 
subjects are assumed to be in a lower state of arousal than the young, then perhaps this 
explains age differences in selectivity. However attractive this appears, it is little more than 
an interesting speculation since it amounts to mapping one ill-defined variable (ageing) into 
another (arousal state). 


The second theory focuses upon differences in confidence that young and old subjects 
have about information concerning when critical events are going to occur. It is assumed 
there are periods when this information suggests that one of the sources is most likely to 
present the next critical event, and subjects prefer to observe this source during the period. 
Over the session, responding based upon this information is selectively distributed since it 
most often favours the source where signals occur most frequently. In contrast, when there 
is insufficient information there is no rational basis for choosing which source to observe 
and responses are randomly allocated. Sanford & Maule (1973a) argue that the quality of 
information varies from very ambiguous to strongly suggestive that one source will present 
the next critical event. Subjects must determine when the information is sufficiently reliable 
to use. If old subjects are less confident, they require information to be more reliable before 
using it. In consequence, old subjects will spend more of the sessions responding randomly 
and their monitoring behaviour will be less selective. 
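
The reasoning of this second theory can be made concrete with a small numerical sketch. It is an illustration only, not a model fitted by the authors: the uniform distribution of information reliability, the probability that usable information favours the high-frequency source, and the two threshold values are all assumptions chosen for the example.

```python
def expected_share_on_frequent_source(threshold, p_favours_frequent=0.6, n_sources=3):
    """Expected proportion of observations directed at the most frequent source.

    Assumptions (hypothetical, not taken from the paper):
    - the reliability of the available information is uniform on [0, 1];
    - a subject uses the information only when reliability exceeds `threshold`;
    - information that is used favours the frequent source with probability
      `p_favours_frequent`;
    - otherwise responses are spread evenly over `n_sources` sources.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    p_informed = 1.0 - threshold   # share of the session in which information is used
    p_random = threshold           # share of the session spent responding randomly
    return p_informed * p_favours_frequent + p_random / n_sources


# A less confident subject is represented by a higher threshold.
young = expected_share_on_frequent_source(threshold=0.3)
old = expected_share_on_frequent_source(threshold=0.6)
print(f"young: {young:.2f}, old: {old:.2f}")
```

Under these assumptions, raising the threshold moves the expected share towards the chance level of one third, which is the reduced selectivity the theory predicts.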

A third possible explanation may be constructed by considering the role of memory in 
monitoring situations. It seems likely that a crucial component of monitoring behaviour is 
a regularly updating memory system for each of the sources. The contents of this memory 
represent the subject’s expectation as to the current state of that source. This is dependent 
upon the state of the source on the last observation, plus an updating based upon the time 
since the last observation and the changes in state likely to have occurred during this 
interval. By inspecting the contents of the updatable memories and comparing them with 
the information from the source currently being observed, subjects may decide which 
source to observe next. It has often been reported that the memory of the elderly is far less 
efficient (see Eysenck, 1977 for a review), and a similar deficit in an updatable memory 
system would necessarily reduce monitoring efficiency. In particular, old subjects would 
have less reliable information concerning the states of the sources currently not being 
observed, and may not leave these sources unsampled for so long. This would necessarily 
lead to a more even distribution of responses between the sources, and thereby account for 
the decreased selectivity in the elderly. Though this is an attractive explanation, we need 
more evidence to assess the role of memory in monitoring tasks. 

We now have better evidence on which to base an adequate theory of the sampling 
deficit in the elderly. Apart from further investigation of this effect, it remains to identify 
the implications of this finding for the everyday behaviour of the elderly, particularly when 
involved in such activities as driving. 
Effects of cigarette smoking on immediate memory and performance in 
different kinds of smoker 


D. Gareth Williams 





Changes in performance resulting from smoking were assessed in 48 male cigarette smokers who were 
classified both by daily consumption and by relative desire for smoking in situations inducive of high 
or low arousal. Each took part in four first cigarette of the day conditions involving a sham smoking 
control and three cigarettes differing in nicotine delivery. With increasing cigarette strength, gains in 
letter cancellation speed on smoking increased, although an inverted-U relationship was suggested; 
immediate memory accuracy progressively deteriorated once pre-smoking performance was controlled 
for. Smokers with greater desire to smoke in situations inducive of low arousal appeared to react 
more strongly to cigarettes and showed superior gain in cancellation speed on smoking. High 
cigarette consumption did not lead to between-day tolerance to cigarette effects. 





Studies of the effects of cigarette smoking on performance and studies by experimental 
psychologists on arousal and performance will benefit by cross-fertilization. Studies of 
arousal and performance will be more firmly established if conclusions drawn largely from 
time-of-day and noise studies prove to be more generally applicable. Changes in ‘arousal’ 
affect different tasks differently, and different measures of performance efficiency, i.e. speed 
and accuracy, may show different functions (Hockey & Colquhoun, 1972; Folkard, 
1975; Hamilton et al., 1977). Short-term memory seems particularly vulnerable to 
increments in arousal level (Folkard et al., 1976), deteriorating while other tasks involving 
a more immediate processing or ‘throughput’ of information are improving (Hockey & 
Colquhoun, 1972). Although studies specifically of short-term memory and arousal are 
more broadly based than time-of-day studies, arousing white noise and arousing 
material to be recalled are not exhaustive extensions, and the precise nature of the 
relationship and its explanation remain contentious (Craik & Blankstein, 1975). On the 
other hand the scattered studies of cigarette smoking and performance reporting 
unexpected (e.g. Cotten et al., 1971) or complex relationships (e.g. Friedman, 1972; 
Andersson, 1975a, b) are in need of integration through the insights gained in these other 
areas. That such considerations are relevant is indicated by Friedman's (1972) finding of a 
facilitative effect of smoking upon mental arithmetic speed whereas accuracy declined. The 
position for short-term memory is more obscure. While Andersson (1975b) found a 
deterioration in verbal rote learning in the short term, Andersson & Hockey (1977) found 
deterioration to occur with incidental but not intentional memory in an immediate serial 
recall task after smoking. 

Neither of these latter studies made distinctions between kinds of smoker nor indeed 
have most of the few experimental studies of cigarette effects. However, cigarette smokers 
are not an homogeneous group. People smoke for different reasons and in different 
preferred situations (McKennell, 1970) and thus may achieve different benefits. Although 
indicating the ultimate reconciliation of different classification schemes, McKennell (1973) 
has demonstrated the complexity required in an adequate typology of cigarette smokers 
and their motives for smoking. However, Frith (1971a) in his Situational Smoking 
Questionnaire (SSQ) has offered a simple distinction between desire to smoke in situations 
inducive of low arousal such as those concerning relaxation, boredom, or bodily tiredness, 
and desire to smoke in situations inducive of high arousal such as those involving 
emotional stress or mental activity. This distinction, while focusing attention on the 
arousal-changing effects of cigarettes, was suggested as probably similar to McKennell’s 
(1970, 1973) distinction between ‘relaxation’ and ‘nervous irritation’ smoking, respectively. 
However, the higher the daily cigarette consumption the greater the desire to smoke in all 
situations (Frith, 1971a). The simple expedient of expressing smoking desire in one type of 
situation as a ratio of desire in the other produces a description of smoking pattern 
potentially independent of average consumption, allowing the separate assessment of the 
effects of both features. 
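
The ratio measure can be sketched as follows. The direction of the ratio (low-arousal desire divided by high-arousal desire) and the rating values are assumptions made for illustration; the text specifies only that desire in one type of situation is expressed as a ratio of desire in the other.

```python
from statistics import mean


def arousal_ratio(low_arousal_ratings, high_arousal_ratings):
    """Mean desire to smoke in low-arousal situations divided by the mean
    desire in high-arousal situations (direction of the ratio is assumed)."""
    low = mean(low_arousal_ratings)
    high = mean(high_arousal_ratings)
    if high == 0:
        raise ValueError("mean high-arousal desire must be non-zero")
    return low / high


# Hypothetical ratings for a lighter and a heavier smoker with the same pattern:
# the heavier smoker reports twice the desire in every situation.
lighter = arousal_ratio([2, 3, 4], [1, 2, 3])
heavier = arousal_ratio([4, 6, 8], [2, 4, 6])
print(lighter, heavier)
```

Because a uniform increase in desire across all situations scales numerator and denominator alike, the ratio is unchanged, which is why it is potentially independent of average consumption.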

Myrsten et al. (1975) have provided the first experimental evidence that different types of 
smoker do indeed obtain different benefits from smoking. Using an unpublished 
combination of Frith and McKennell items for selection, they compared eight extreme 
‘low-arousal’ situation smokers with eight ‘high-arousal’ smokers in an experimental-task 
situation of each type. Performance and general well-being of the low-arousal smokers were 
favourably affected by smoking only in the low-arousal test situation, whereas high-arousal 
smokers were favourably affected only in the high-arousal situation. While Myrsten et al. 
were essentially concerned with the modulation by smoking of different situation effects their 
results raise a fundamental question implicit in any distinction between smokers. If smokers 
gain different benefits from smoking, why? Through adjustments to smoking behaviour 
(Ashton & Watson, 1970; Frith, 1971b; Fuller & Forrest, 1973; Clark, 1975) and 
consequent alterations in nicotine delivery and concomitant psychological features, smokers 
seem able to self-titrate for desired effects within a range of possible outcomes. Are there in 
addition different reactions to nicotine immanent to different kinds of smoker? By 
systematically varying cigarette nicotine delivery and by requiring relatively standard 
behaviour through experimenter-paced smoking in the manner of Agué (1973) a more 
direct approach to this question seems possible. 

This study therefore was of the immediate effects of the first cigarette of the day in 
smokers distinguished by firstly, average daily consumption or degree of smoking, and 
secondly, relative desire for smoking in high or low arousal situations. The first cigarette of 
the day has maximum impact (Frankenhaeuser et al., 1968). Changes in performance were 
assessed in the same situation by two tasks chosen to make minimal and high demands 
respectively upon immediate memory; letter cancellation depending predominantly upon 
speed, and immediate memory for digit strings depending predominantly upon accuracy. It 
was predicted that (i) with increasing nicotine delivered gains in cancellation speed should 
improve with the possibility, after Blake (1971), of an inverted-U relationship emerging at 
high levels of nicotine-produced arousal, (ii) after Hockey & Colquhoun (1972), as 
cancellation speed improves immediate memory efficacy should progressively deteriorate, 
(iii) with increasing average daily consumption tolerance for cigarette effects should 
increase, thus the effects of cigarettes should be more apparent in ‘light’ smokers than 
‘heavy’ smokers, (iv) after Myrsten et al. (1975), smokers with greater desire to smoke in 
high-arousal situations should differ in cigarette response to those preferring to smoke in 
low-arousal situations. 


Method 
Design 


The experimental design was a 3 (degree) x 2 (type) x 4 (cigarettes) factorial with repeated measures 
on the cigarette treatment factor. Subjects were classified by degree of smoking into (i) light smokers, 
15 or fewer cigarettes on average smoked per day, (ii) medium smokers, 16-25 cigarettes/day, (iii) 
heavy smokers, more than 25 cigarettes/day (cf. Lee, 1976, Table 22 m). Using the Frith (1971a) 
SSQ, subjects were further divided into two types of smoker, (i) low-arousal smokers, with a greater 
average desire to smoke in situations inducive of low arousal than in high arousal situations, and (ii) 
high-arousal smokers, with greater desire to smoke in high-arousal rather than low-arousal situations. 
The design was completed and replicated with 24 subjects in each case. 
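
The two classification rules can be written out as a short sketch. The function names are hypothetical. Two points are assumptions not covered by the text: averages falling between 15 and 16 are treated as medium, and a subject with equal mean desire in both kinds of situation is left unclassified.

```python
def degree_of_smoking(cigarettes_per_day):
    """Classify by average daily consumption using the cut-offs in the Design section."""
    if cigarettes_per_day <= 15:
        return "light"
    if cigarettes_per_day <= 25:
        return "medium"   # assumption: fractional averages above 15 count as medium
    return "heavy"


def type_of_smoker(mean_low_arousal_desire, mean_high_arousal_desire):
    """Classify by which kind of situation attracts the greater average SSQ desire."""
    if mean_low_arousal_desire > mean_high_arousal_desire:
        return "low-arousal"
    if mean_high_arousal_desire > mean_low_arousal_desire:
        return "high-arousal"
    return None           # ties are not addressed in the text


# Hypothetical subject: 20 cigarettes/day, stronger desire when relaxed or bored.
print(degree_of_smoking(20), type_of_smoker(3.4, 2.1))
```

Crossing the three degrees with the two types gives the six smoker groups of the design.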
Subjects 


The subjects were 48 male students or employees at the University of Sussex, selected by daily 
consumption and SSQ scores to form the six smoker groups defined by the design. Average weight 
was 67.1 kg (range 54-88 kg) and average age 25.5 years (range 20-46 years). All were habitual 
cigarette smokers who had smoked at their current rate for at least 1 year and characteristically 
inhaled the smoke fully or partly into the chest. Those who inhaled only to the back of the throat 
were excluded. Modal strength of preferred cigarette brands was 'middle tar' for each group. Subjects 
were paid £5 for their participation over 4 days and were informed in advance that the study was of 
the effects of the first cigarette of the day. 


Materials 


Cigarettes. Treatment levels were (i) sham smoking of an unlit cigarette, and actual smoking of (ii) a 
mild cigarette delivering 0.6 mg nicotine and 7 mg 'tar' (particulate matter, water and nicotine free) 
when machine smoked to UK standard parameters, (iii) a medium strength cigarette (1.3 mg nicotine, 
19 mg tar), and (iv) a strong cigarette (1.8 mg nicotine, 27 mg tar). All cigarettes were approximately 
83 mm long, filter-tipped, contained only Virginia tobacco, and were visually indistinguishable. 


Questionnaires. Subjects completed (a) a questionnaire on smoking habits, (b) the Eysenck & Eysenck 
(1975) Eysenck Personality Questionnaire (EPQ), (c) the Frith (1971a) SSQ. 


Procedure 


Each subject each day had undertaken to abstain from all tobacco products, tea, coffee, cocoa, 
alcohol and all other psychoactive substances since at least the previous midnight. Otherwise, subjects 
were asked to take their normal breakfast, to have had a normal night's sleep and to be in good 
health. Subjects were tested in groups of 3 to 12 in the same quiet, comfortably furnished room. Each 
attended at 9.30 a.m. on 4 days always lighting up his experimental cigarette (or sham smoking) at 
10.00 a.m. ± 5 minutes. Cigarette treatment level per day to each subject within each smoker 
subgroup was determined by a 4 x 4 balanced Latin square which was changed for the design 
replication. All sessions followed the same procedural order: letter cancellation test, an immediate 
memory test, paced cigarette smoking, an immediate memory test and a final letter cancellation test. 
All testing was completed within 15 minutes after termination of smoking. Sessions were run by a 
female research assistant blind to cigarette nicotine content. For about 5 min immediately before and 
after smoking subjects completed an eclectic mood schedule, reported separately (Williams, 1978). 
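
A 4 x 4 balanced Latin square of the kind mentioned above can be generated as in the sketch below. This is one standard construction (each treatment appears once per row and column, and each treatment immediately follows every other treatment exactly once); the particular squares used in the study are not reported, and the mapping of rows to subjects and columns to days is an assumption.

```python
def balanced_latin_square(n):
    """Return an n x n balanced Latin square (n must be even).

    The first row runs 0, 1, n-1, 2, n-2, ...; each later row adds 1 modulo n.
    """
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of treatments")
    first_row = [0]
    low, high = 1, n - 1
    for k in range(1, n):
        if k % 2 == 1:
            first_row.append(low)
            low += 1
        else:
            first_row.append(high)
            high -= 1
    return [[(x + shift) % n for x in first_row] for shift in range(n)]


treatments = ["sham", "mild", "medium", "strong"]
square = balanced_latin_square(len(treatments))
for subject, row in enumerate(square, start=1):
    # Assumed layout: one row per subject in a subgroup, one column per test day.
    print(f"subject {subject}:", [treatments[i] for i in row])
```

A different square for the design replication could be obtained, for example, by relabelling the treatments before applying the same construction.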


Letter cancellation task. Subjects were required to work as fast as possible without errors crossing out 
each instance of an ‘E’ found in sheets of randomly ordered letters arranged in lines of 30 letters. The 
score was the total number of letters scanned in 3 min. Errors were not recorded. 


Immediate memory task. Six pre-recorded sequences of nine random digits with the constraint of no 
adjacent repetitions were read out at one digit per second. Each sequence was preceded by a warning 
auditory click and followed by a click signalling the start of an 11 s recall period. During recall, 
subjects wrote as many digits as could be remembered in appropriate boxes on printed record sheets. 
Only the right digit in the right box counted as correct and the score was total errors made over six 
sequences. Different sequence sets were used for each test and each session. Subjects were advised 
that this was considered to be a relatively difficult task, asked just to do their best, and given one 
practice sequence on first testing. 
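
The construction and scoring rules for this task can be sketched as follows. The function names are hypothetical, and representing a blank box as `None` is an assumption; the original sequences were pre-recorded, so the random generator here only illustrates the stated constraint.

```python
import random


def make_sequence(length=9, rng=random):
    """Random digits 0-9 with no digit repeated in adjacent positions."""
    sequence = [rng.randrange(10)]
    while len(sequence) < length:
        digit = rng.randrange(10)
        if digit != sequence[-1]:      # reject adjacent repetitions
            sequence.append(digit)
    return sequence


def count_errors(presented, recalled):
    """Number of boxes that do not hold the right digit in the right position.

    `recalled` has one entry per box; a blank box is given as None.
    """
    if len(presented) != len(recalled):
        raise ValueError("one recalled entry is needed per presented digit")
    return sum(1 for p, r in zip(presented, recalled) if p != r)


presented = [1, 2, 3, 4, 5, 6, 7, 8, 9]
recalled = [1, 2, 3, 4, 5, 6, 7, 9, 8]   # last two digits transposed
print(count_errors(presented, recalled))
```

A test score is then the sum of `count_errors` over the six sequences, so a transposition such as the one above costs two errors.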


Paced smoking. Subjects were reminded how to inhale and given a demonstration. Smoking rate was 
paced by timed lights. A ‘ready’ light warned smokers to raise their cigarette to their lips and was 
followed by a ‘go’ light which appeared for 2 s every 30 s. Subjects were instructed to steadily fill 
their mouths with smoke throughout the display of ‘go’ and on its termination to take back the 
smoke into their chest followed fairly quickly by exhalation. Smokers lit their cigarettes without 
inhaling and the timer started. Smoking continued to a line drawn at filter cork overwrap plus 5 mm. 
Sham smokers terminated with the first actual smoker. The number of puffs taken by each smoker 
was recorded and stubs were checked for satisfactory completion. 

Subjects were informed that the rate of smoking had been determined from previous studies of the 
natural rate of smokers when relaxing quietly without task demands (Ashton & Watson, 1970; Fuller 
& Forrest, 1973). While twice the rate employed in the UK Government Chemist’s standard 
procedure, Agué’s (1973, Table 1) figures indicate that even so nicotine delivery rarely reaches 
machine delivery levels although rank order between cigarette brands is preserved. 


Results 

Comparability of smoker groups. The selection procedure established six smoker groups 
differing in daily consumption and type of preferred smoking situation but on no other 
measured variable of theoretical significance. All questionnaire scales and background data 
were treated to univariate 3 (degree) x 2 (type) factorial analyses of variance. Neither main 
nor interaction effects were significant for body weight, characteristic depth of inhalation, 
EPQ extraversion, neuroticism and lie scores. The degree x type interaction for EPQ 
psychoticism was significant (F = 3.46, d.f. = 2, 42, P < 0.04) but no comparisons between 
group means were significant by the Newman-Keuls test. The degree main effect was 
significant for age (F = 4.19, d.f. = 2, 42, P < 0.03); heavy smokers were older than 
medium (P < 0.05) and light (P < 0.05) smokers. 

The independence of the degree of smoking and type of smoker classifications was 
confirmed by a non-significant main effect of degree on Frith arousal-ratio scores (F = 1.25, 
d.f. = 2, 42, P = 0.3). Average daily consumption by light, medium and heavy smokers was 
8.8, 19.7 and 34.4 cigarettes, respectively; by low and high arousal smokers, 20.5 and 21.4, 
respectively. 


Smoking behaviour. Number of puffs per cigarette were given a univariate 3 (degree) x 2 
(type) x 3 (smoked cigarettes) factorial analysis of variance (repeated measures on 
cigarettes) replacing four missing puff values by means within smoker group. Of all effects 
only the cigarette main effect was significant (F = 8.8, d.f. = 2, 84, P < 0.001). Fewer puffs 
were needed to finish the medium cigarette than the mild (P < 0.01) and strong (P < 0.01) 
cigarettes. Manufacturer's machine smoking data suggest this difference was intrinsic to the 
cigarettes. No other comparisons were significant. Total-sample means for mild, medium 
and strong cigarettes were 15.75, 15.08 and 15.87 puffs, respectively. 

The average number of puffs taken by individual subjects ranged between 13 and 18.67, 
indicating differences between subjects in puff volume. Furthermore, interviews with 
selected subjects confirmed that some self-titrated smoke effects by adjusting inhalation 
depth and duration. Some subjects clearly conformed to both the spirit and the letter of 
procedural instructions, others self-titrated via uncontrolled smoking parameters apparently 
as an individual function of perceived cigarette strength and taste acceptability. 


Letter cancellation task. Pre-smoking letter cancellation scores were subtracted from 
post-smoking scores and resultant gains given a 3 (degree) x 2 (type) x 4 (cigarettes) 
factorial analysis of variance (repeated measures on cigarettes). Two main effects, but 
nothing else, were significant, type of smoker (F = 9.15, d.f. = 1, 42, P < 0.004) and 
cigarettes (F = 7.96, d.f. = 3, 126, P < 0.001). Mean gains per cigarette for all subjects 
combined are reported in Table 1. By the Newman-Keuls test, gain on the sham cigarette 
was lower than on each of the actually smoked cigarettes (P < 0.01 in each case). No other 
comparisons were significant. 

Mean gains for low- and high-arousal smokers separately are also reported in Table 1. 
Upon smoking, low-arousal smokers gained an average 9.8 per cent of pre-smoking scores; 
high-arousal smokers gained an average 4.5 per cent. The two groups differed significantly 
in gain on the mild (P < 0.05) and medium (P < 0.05) cigarettes. 
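
The gain measures used here amount to the arithmetic sketched below. The scores are invented for illustration, and expressing the gain as a percentage of the subject's own pre-smoking score is an assumption about how the percentages were derived.

```python
def gain(pre_score, post_score):
    """Absolute gain: letters scanned after smoking minus letters scanned before."""
    return post_score - pre_score


def percent_gain(pre_score, post_score):
    """Gain expressed as a percentage of the pre-smoking score."""
    if pre_score == 0:
        raise ValueError("pre-smoking score must be non-zero")
    return 100.0 * (post_score - pre_score) / pre_score


# Hypothetical subject: 1500 letters scanned before smoking, 1650 afterwards.
print(gain(1500, 1650), percent_gain(1500, 1650))
```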

A monotonic trend test for correlated samples (Ferguson, 1976) was significant for trend 
across cigarette levels for the total sample (z = 2.16, P < 0.05), and for low-arousal 
smokers (z = 2.29, P < 0.05), but not for high-arousal smokers (z = 0.69, n.s.). Olshen's 
(1967) extension to the sign test can be used in this case effectively to test for an inverted-U 
trend across cigarettes. For the total sample, z = 1.53, P (one-tail) = 0.063; for high-arousal 
smokers, z = 0.14, n.s.; for low-arousal smokers, z = 2.17, P < 0.05. 


Table 1. Average gains in letters cancelled with different cigarettes smoked 

                              Cigarettes 
Group               Sham     Mild    Medium   Strong 
All subjects (a)    25.3    129.5    161.5    118.7 
Low-arousal (b)     29.5    198.3    223.2    147.0 
High-arousal (b)    21.1     60.6     99.8     90.5 

(a) n = 48; SE per mean = 20.7. 
(b) n = 24; SE per mean = 29.3. 


Regressing gains upon pre-smoking scores for an analysis of covariance made no 
essential difference to the significances and pattern of results. With such analysis, high-arousal 
smokers showed optimal gain on the strong rather than medium cigarette; 
covariate-adjusted gains being 89.1 and 83.5, respectively. 


Immediate memory. Increases in errors in immediate recall of digit sequences were analysed 
in the same way as letter cancellation gains. No main or interaction effects were significant. 
However, inspection of individual scores suggested considerable differences between 
subjects in efficacy of memory strategies brought to the task. Absolute gain scores were 
misleading therefore since more account needed to be taken of pre-smoking performance in 
these naive and relatively unpractised subjects. Accordingly, gain scores were regressed on 
pre-smoking scores and in a 3 (degree) x 2 (type) x 4 (cigarettes) factorial analysis of 
covariance (repeated measures on cigarettes) the cigarette main effect, conditional on initial 
performance, was significant (F = 3.68, d.f. = 3, 125, P < 0.02). All other effects remained 
non-significant. Covariate-adjusted mean gains in errors per cigarette for the total sample are 
reported in Table 2. Adjusted error gain in the sham smoking condition was lower than 
for the medium (P < 0.05) and strong (P < 0.05) cigarettes. No other comparisons were 
significant. The monotonic trend test for correlated samples was significant for trend across 
cigarette levels for the total sample (z = 2.21, P < 0.05). To facilitate discussion, adjusted 
gains for high- and low-arousal smokers separately are also reported in Table 2. 


Table 2. Covariate-adjusted average gains in immediate-memory errors with different 
cigarettes smoked (pre-smoking scores as covariate) 

                              Cigarettes 
Group               Sham     Mild    Medium   Strong 
All subjects (a)   -0.54     0.31     1.78     1.98 
Low-arousal (b)    -0.34     1.32     1.43     3.18 
High-arousal (b)   -0.75    -0.71     2.12     0.78 

(a) n = 48; SE per mean = 0.62. 
(b) n = 24; SE per mean = 0.87. 

Discussion 

Although residual self-titration will have reduced the contrast between smoked cigarettes, 
nevertheless cigarettes appear to have had the predicted effect upon the two performance 
tasks. Although letter cancellation improvement increased monotonically with cigarette 
strength in all smokers combined and in low-arousal smokers taken separately, the degree 
of curvature was sufficient to establish a significant inverted-U trend in gains for the 
low-arousal smokers and a borderline result for all smokers combined; the curve for 
high-arousal smokers was too shallow for significance. 
An inverted-U relationship is in accordance with the theoretical prediction by 
Blake (1971) after Corcoran (1962) although it is clear that rather high levels of nicotine 
would be needed before performance actually deteriorated. On the other hand, the 
monotonic increasing relationship between cigarette strength and memory deficiency is in 
apparent agreement with the linear relationship between tonic arousal level and short-term 
memory for items found by Baddeley et al. (1970) and Hockey et al. (1972), although 
through use of only two levels of arousal (times of day) neither of these latter studies 
could effectively test for a curvilinear relationship. However, note that in the current study 
subjects were coincidentally tested very close to that time of day for which Blake (1971) 
found optimal performance for digit span. Thus these results do not completely rule out the 
possibility of an inverted-U relationship between cigarette-produced arousal and immediate 
memory (cf. Berlyne et al., 1965), and such a relationship might be found were subjects to be 
required to smoke the day's first cigarette at say 8.00 a.m. when presumably they would be 
at a suboptimal level of arousal prior to smoking. 

The failure of the degree of smoking classification unconfounded by type of smoker 
differences to relate to performance change is surprising but not unreasonable. The plasma 
half-life of nicotine is under 30 min (Isaac & Rand, 1972) and all smokers abstained for at 
least 10 hours before smoking. Apparently all smokers of a given type were thereby 
brought to a common base-line of nicotine sensitivity. There was, therefore, no 
between-day tolerance to nicotine effects on performance caused by level of consumption. 
This was in marked contrast to differences between types of smoker. 
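
The arithmetic behind this argument can be checked with a short sketch. It assumes simple first-order (exponential) elimination and uses the quoted 30 min as an upper bound on the half-life and 10 hours as the minimum abstinence; metabolites and individual variation are ignored.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of a substance left after `elapsed_min`, assuming first-order
    elimination with a constant half-life (a simplifying assumption)."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


# Ten hours of abstinence at a 30 min half-life is 20 half-lives.
residual = fraction_remaining(elapsed_min=10 * 60, half_life_min=30)
print(f"{residual:.2e}")
```

On these assumptions less than one part in a million of the previous day's plasma nicotine would remain, whatever the subject's usual consumption, which is consistent with the common base-line suggested above.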

The considerable variation within smoker groups in efficacy of strategies brought 
to the immediate memory task was probably responsible for obscuring any type-of-smoker 
effect, and the standard errors of means for adjusted gains on each cigarette were notably large. 
Nevertheless, high-arousal smokers on average gained fewer errors than low-arousal smokers 
for three out of the four cigarette levels, a relative position compatible with their poorer 
speed gains in letter cancellation. 

The differential gains in letter cancellation by high- and low-arousal smokers are of 
considerable interest. They may have represented no more than differential self-titration 
between the two types, but there were no group differences in either number of puffs taken 
per cigarette or self-reported characteristic depth of inhalation. Nor did differences 
represent some initial-value effect since low-arousal smokers before smoking were on 
average marginally but not significantly faster than high-arousal smokers. A more 
interesting possibility is that these high-arousal smokers were less sensitive to the 
psychopharmacological effects of nicotine, a difference which would explain how they could 
continue inhaling an arousing, specifically stimulating, drug in stimulating conditions. 
However, Myrsten et al. (1975) found that, in terms of heart rate, their high-arousal 
smokers were physiologically more reactive to cigarettes in both their test situations 
compared to low-arousal smokers. Further reconciliation between the Swedish study and 
the current results is difficult to achieve due to differences in experimental procedures and 
selection methods. Nevertheless, both studies lead to the same conclusion that there are 
important contrasts to be found in smoking effects in different smoker types and that such 
distinctions between smokers should be included in future research. The way ahead may be 
indicated by Schalling's (1977) report, not available in time for this study, of attempts to 
distinguish preferred smoking in different kinds of low arousal situation, such as relaxing 
versus monotonous and boring, and different kinds of high arousal situations, such as 
those involving anticipation versus vague distress. 

In conclusion, given the discrepancies between the present study and those of Andersson 
(1975a) and Andersson & Hockey (1977) more research on cigarette effects is needed using 
different materials for short-term recall, contrasting digit strings, word lists, paired 
associates and so on, contrasting incidental and intentional memory, and comparing 
short-term with long-term recall. Andersson & Post (1974) and Andersson (1975b) using 
verbal rote learning found initial deficits in performance after smoking but better 
performance subsequently, suggesting improved delayed recall. However, studies of 
cigarette effects on eventual long-term retrieval need careful design to control for the 
residual effect of nicotine in enhancing mental endurance (Frankenhaeuser et al., 1971; 
Myrsten et al., 1972; Hartley, 1973), particularly relevant where repeated learning trials 
are used over prolonged test sessions and the measure of long-term memory is performance 
at the end of a session of less than 90 min following a cigarette (cf. Frankenhaeuser et al., 
1970). 

The effect of three instructional sets on the recall of story-like material 
I. M. Cornish 








An earlier study of prose recall (Cornish, 1978) was extended to look at the effects of giving subjects 
different recall instructions; these were ‘normal’, or emphasized accuracy or quantity of recall. 
Instructional differences affected the non-verbatim and intrusive recall components, but not verbatim 
recall. Previous results were therefore unlikely to have been influenced by instructional ambiguities. 
Together, the two studies go some way towards separating the effects of acquisition and recall 
processes in reproduced prose material. 





Cornish (1978) analysed recalled prose into three components: ‘verbatim’, ‘non-verbatim’, 
and ‘intrusive’, the actual quantities being designated V, X and I. The total number of 
words in a reproduction, W, was the sum of V, X and I, and this scheme, though crude, 
proved satisfactory for immediate recall. The main finding was that most of the variations 
in recall, whether among subjects, passages or order of presentation, were due to variations 
in V. With a few exceptions, X and I remained more or less constant. 
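
The scoring scheme can be illustrated with a minimal sketch. The class name and the example counts below are hypothetical and are not taken from the data; only the relation W = V + X + I comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordScores:
    """Word scores for one reproduction, following the V/X/I scheme."""

    verbatim: int      # V: words reproduced verbatim
    non_verbatim: int  # X: non-verbatim words
    intrusive: int     # I: intrusive words

    @property
    def total(self) -> int:
        # W is defined in the text as the sum of V, X and I.
        return self.verbatim + self.non_verbatim + self.intrusive


# Hypothetical counts for a single reproduction (illustration only).
example = WordScores(verbatim=70, non_verbatim=50, intrusive=15)
print(example.total)
```

Keeping W as a derived quantity rather than a stored one ensures that the three components always account for every word in a reproduction.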

The present study extends the previous one by examining the effects of varying recall 
instructions, using the same analytic scheme. It was thought possible that the earlier 
instructions, phrased to balance accuracy and quantity of recall, may have been interpreted 
differently by different subjects or at different times. This could be checked by introducing 
instructions which stressed each of these two extremes. Also, manipulating how subjects 
recall prose material might help distinguish effects arising during recall from those 
produced during ‘storage’ or initial acquisition. 

Gauld & Stephenson (1967) found that instructions which emphasized accuracy 
produced a significant decrease in the number of ‘errors’ in subsequent story 
reproductions. The present study goes further by adding a third set of instructions which 
stress quantity of recall, and by employing broader, more detailed analysis. 


Method 
Subjects 


Thirty-six students (15 male, 21 female), mostly undergraduates and from a variety of disciplines, 
acted as subjects. All were volunteers, paid 20 pence each for the session, and none had any prior 
experience of such an experiment. 


Passages 


Three story-like passages (1A, 2C and 3B) were selected from a set of nine specially written for the 
earlier study, and represent between them each of the three types of form and content used in their 
construction. None had previously shown extreme or atypical results. 


Design 


Each subject received all three passages in a single session. Within each of the three instructional 
conditions, each of the six possible sequences of passages occurred twice. Subjects were randomly 
allocated to passage sequences and instructional conditions. There were thus 12 subjects in each 
condition, each subject recalling under the same condition throughout. 
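
One way such an allocation could be generated is sketched below. The function name, the subject numbering and the fixed random seed are assumptions made for illustration; the text specifies only that each of the six passage sequences occurred twice within each instructional condition and that allocation was random.

```python
import random
from itertools import permutations

PASSAGES = ("1A", "2C", "3B")
CONDITIONS = ("N", "P", "L")  # normal, precise, liberal


def build_allocation(subject_ids, seed=0):
    """Randomly assign subjects to (condition, passage sequence) cells.

    Each of the 6 passage sequences appears twice within each of the
    3 conditions, giving 3 x 6 x 2 = 36 cells, one per subject.
    """
    cells = [
        (condition, sequence)
        for condition in CONDITIONS
        for sequence in permutations(PASSAGES)
        for _ in range(2)
    ]
    subjects = list(subject_ids)
    if len(subjects) != len(cells):
        raise ValueError("need exactly one subject per cell")
    rng = random.Random(seed)  # seed is arbitrary, chosen for repeatability
    rng.shuffle(subjects)
    return dict(zip(subjects, cells))


allocation = build_allocation(range(1, 37))
```

Shuffling the subjects rather than the cells keeps the balanced structure fixed while leaving the assignment of any individual to a cell random.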


Instructions 


The instructions used here were a modification of the earlier ones. Only recall instructions differed for 
the three groups of subjects. ‘Normal’ recall instructions (N), similar to previous ones, were written 
to strike a balance between demanding accuracy of fact or phrasing, and permitting the reproduction 
of material regardless of likelihood. ‘Precise’ (P) and ‘liberal’ (L) instructions emphasized accuracy 
and quantity of recall respectively as the primary requirement. The three sets of recall instructions are 
given in the Appendix. 


Procedure 


Administration of the experiment followed closely the procedure given in Cornish (1978), except that 
subjects read each passage once instead of twice. This was an attempt to reduce any tendency to 
self-correction, perhaps influenced by a knowledge of the recall requirements, during a second 
reading. After presentation of each passage, recall instructions were given, followed by the subject’s 
attempt to recall the passage. Only when this was completed were subjects given the next passage to 
read. 

Debriefing after the experiment included mention of the different types of instruction being used, 
and stressed that any future experiment would use only ‘normal’ recall instructions. Subjects under P 
and L conditions were shown how N instructions differed from their own. 


Results 


Analysis into ‘word scores’ used the scheme described in Cornish (1978). Means of W, V, 
X and I are presented in Table 1, together with ANOVA results. Differences in recall 
instructions had no effect on the verbatim recall component, but quite large effects on the 
non-verbatim and intrusive components. Order of presentation, in contrast, influenced 
only the verbatim component, in agreement with the earlier study. There were only small 
differences among the passages, unlike previous results, but as the passages had been 
chosen to be fairly similar in this respect, the finding is not important. The marginally 
significant difference in intrusions among passages is of little consequence in the context of 
a dozen tests of significance. Comparison with the results of the earlier study, difficult 
because of design differences, suggested that reading passages once instead of twice had 
depressed W somewhat (by an average of 16 words), principally by decreasing V and I. 


Table 1. Word-score means and ANOVA results for instructions, passages and order of 
presentation 

                               W            V            X            I

Instructions (n = 36)
  Liberal                    156.6         73.3         62.4         20.5
  Normal                     136.5         71.2         52.3         13.3
  Precise                    129.5         70.3         47.5         11.7
  (F =)                     (6.26**)       (<1)      (10.55***)   (10.32***)

Passages (n = 36)
  1A                         140.3         73.3         52.4         14.1
  2C                         150.3         73.9         58.5         18.2
  3B                         131.9         67.7         51.3         13.2
  (F =)                      (2.68)        (<1)        (2.72)       (3.25*)

Order of presentation (n = 36)
  1st                        120.9         54.1         51.2         16.1
  2nd                        144.3         75.1         55.4         14.6
  3rd                        157.2         85.7         55.6         14.8
  (F =)                    (10.81***)   (16.67***)     (1.11)        (<1)

Overall (n = 108)
  Means                      140.8         71.6         54.1         15.2
  SD                          38.7         26.7         15.5          9.7

* P < 0.05; ** P < 0.01; *** P < 0.001. 
d.f. = 2, 101. 

Discussion 


The present experiment showed that varying the instructions for recalling prose produced 
variations in the intrusive and non-verbatim recall components, but not in the verbatim 
component. This pattern of results, the opposite of what Cornish (1978) found, makes it 
unlikely that differences among subjects’ interpretations of the experiment can have 
influenced the earlier findings. 

The above results agree with those of Gauld & Stephenson (1967) in so far as the greater 
the accuracy demanded, the fewer were the errors (i.e. X and I) produced. It might have 
been thought that V would increase with stricter instructions, but no such effect was 
observed. Subjects behave rather as if they could operate a variable criterion between what 
is recalled per se, and what is passed for overt reproduction, but are unable to improve on 
the ‘quality’ of what is recalled in the first place. 

One simple explanation for the difference between this and the earlier experiment is that 
the factors explored by Cornish (1978) — order of presentation, passage and subject 
differences — act mainly during the initial acquisition and storage of material, whereas recall 
instructions affect only recall (anticipatory effects elsewhere might have been expected). 
Such an experimental separation of the processes underlying prose memory might be used 
to clarify other work in the area. 

Bartlett (1932) first described the constructive character of memory, considering that 
‘constructive’ changes emerged during recall, but since his studies of perception also 
showed such changes, they may have occurred during presentation too. Kay (1955) and 
Gomulicki (1956) preferred to describe prose recall as ‘abstractive’ because it showed the 
products of selective processes operating during presentation. Kay did point out, however, 
that the constructive nature of recall increased with longer recall delays, so that perhaps a 
majority of constructive changes still occur after acquisition. Frederiksen (1975) found 
‘over generalized and inferred’ information in subjects’ reproductions of text and ascribed 
them to the processes of acquisition rather than recall. Extending the present work might 
give us a model for prose recall which for the first time distinguishes and describes clearly 
the processes of acquisition and recall. 

Finally, the intermediate stage (‘storage’) must not be forgotten. The failure of verbatim 
recall to be influenced by recall instructions suggests that it may be a feature of how 
material is represented in memory. Further light may be thrown on this possibility by 
qualitative analysis, or by observing the effects of recall delay (which might affect the 
memory representation directly). Reports of such studies are in preparation. 

Appendix 
The three sets of recall instructions 
Precise set (P) 


‘Now, I want you to write out as much of the passage as you can remember, in prose form. I am 
interested principally in accuracy of recall, so you must take particular care over the details and 
wording of what you write down. Don't write down any details until you are reasonably sure of their 
accuracy and correct sequence. Wherever possible, you should try to use the wording of the original 
passage. Take your time over this. There's no need to hurry, and rushing may cause you to omit 
details you would otherwise remember, or make errors of fact or phrasing. Check your account 
through carefully when you have finished, making corrections, additions or footnotes as you wish. 
Spelling and punctuation don't matter. Are there any questions? Right, begin when you are ready. 
Remember, take your time, and it's accuracy that counts.’ 


Normal set (N) 


‘Now, I want you to write out as much of the passage as you can remember, in prose form. I am not 
interested in the exact words used originally, but if you do happen to remember them, so much the 
better. Take your time over this, there's no need to hurry. If there is anything you remember that you 
are not sure about, underline it in your account. Check through what you have written when you 
have finished, making corrections, additions or footnotes as you wish. Spelling and punctuation don't 
matter. Are there any questions? Right, begin when you are ready.’ 


Liberal set (L) 


‘Now, I want you to write out as much of the passage as you can remember, in prose form. I am 
interested principally in how much you can remember, even if what you recall is not particularly 
accurate, although accuracy should still be a subsidiary consideration. I am not interested in the 
exact words used originally, but if you do happen to remember them, so much the better. If you think 
there is a gap in your memory, i.e. a word or phrase or section missing, try to put something in, even 
if it means having an educated guess. Similarly, it is always better to put something down you are not 
sure about than to leave it out altogether. Take your time over this, there's no need to hurry. Check 
through what you have written when you finish, making corrections, additions or footnotes as you 
wish. Spelling and punctuation don't matter. Are there any questions? Right, begin when you are 
ready.’ 

A note on ‘Time of day effects in school children’s immediate and delayed 
recall of meaningful material’ — the influence of the importance of the 
information tested 


Simon Folkard 


In an earlier paper, Folkard et al. (1977), it was reported that children’s immediate 
memory for the information presented in a story was higher at 09.00 than at 15.00, but that 
delayed (1 week) retention was superior following presentation at 15.00. This note reports a 
reanalysis of that data that examined the influence of the importance of the information 
tapped by the individual questions. 

Folkard et al. (1977) interpreted the interaction found between time of presentation and 
delay of recall as reflecting an increase in arousal level over most of the waking day 
(Colquhoun, 1971). There is considerable evidence (reviewed by Craik & Blankstein, 1975) 
that arousal at presentation benefits delayed retention, but the effects on immediate 
memory would appear to be inconsistent. A possible explanation of these effects stems from 
the finding that high arousal biases attention to the more dominant or important sources of 
information (Broadbent, 1971), in that such a bias may affect immediate and delayed 
retention scores differentially. In view of this, the results of Folkard et al. (1977) were 
reanalysed to examine the influence of importance. 

Six undergraduate students were given a transcription of the story and a correctly 
completed questionnaire. They were asked to rate the importance of the information 
tapped by each question to an understanding of the story, using 10 cm visual analogue 
scales. There was a highly significant degree of concordance in these ratings (Kendall’s 
W = 0.58, P < 0.001), and the mean rating given to each question was therefore taken as 
an index of its importance. 
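
For readers unfamiliar with the statistic, the sketch below shows one standard way of computing Kendall's coefficient of concordance W from a raters-by-items table of ratings. It is an illustration only: the function names are hypothetical, no correction for tied ranks is applied to the formula, and it is not known whether the original analysis used such a correction.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m raters (rows) rating the same n items (columns).

    Uses W = 12 S / (m^2 (n^3 - n)), where S is the sum of squared
    deviations of the items' rank totals from their mean. No tie
    correction is applied (an assumption of this sketch).
    """
    m = len(ratings)
    n = len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

In the present case the input would have six rows (the undergraduate raters) and 20 columns (the questions). W ranges from 0 (no agreement among raters) to 1 (identical rankings).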

These ratings of importance were then related to the proportion of the children correctly 
answering each question in the immediate and 1 week delayed conditions, following 
presentation at 09.00 or 15.00. The main findings are shown in Fig. 1 in which the 20 
questions have been grouped into five sets of four questions on the basis of their 
importance ratings. Importance had a far greater effect on immediate recall following 
presentation at 15.00 (r = 0.43, P < 0.05) than at 09.00 (r = 0.20, P > 0.25). Thus the 
superiority of immediate memory at 09.00 was greatest for the least important items, and 
non-existent for the most important ones. In contrast, the proportion of children correctly 
answering a question in the 1 week delayed condition was significantly correlated with 
importance following the original presentation at both 09.00 (r = 0.48, P < 0.05) and 15.00 
(r = 0.39, P < 0.05). This suggests that the superior delayed retention following 15.00 
presentation was relatively constant over the different levels of importance. 

Analyses of variance, in which the 10 least important questions were compared with the 
10 most important ones, confirmed that there was a significant interaction in the immediate 
memory scores between time of day and importance (F = 6.93, d.f. = 1, 40, P < 0.05). 
However, there was no such interaction in the delayed scores (F < 1), confirming that the 
superior delayed retention following presentation at 15.00 was indeed unaffected by 
importance. Further analyses examined the difference between the immediate and delayed 
retention scores (i.e. amount forgotten) as a function of time of presentation and item 
importance. These indicated that importance had no influence on the amount forgotten 
following presentation at 15.00 (F < 1), while presentation at 09.00 resulted in greater 
forgetting of the unimportant items than the important ones (F = 7.51, d.f. = 1, 62, 
P < 0.01). 

Figure 1. The probability of correct recall (corrected for guessing) as a function of importance, in the 
immediate (left-hand panel) and delayed (right-hand panel) tests following presentation at 09.00 or 
15.00. 

The immediate memory results are consistent with the suggestion that arousal biases 
attention to more dominant or important information. They also offer an explanation as to 
why the effects of arousal on immediate memory are inconsistent, since it is clear from Fig. 
1 that the magnitude and direction of the observed effect of arousal on immediate memory 
may depend on the importance of the information tested. However, it is difficult to see how 
such a bias in attention under high arousal can account for the disappearance of this 
interaction in the delayed retention scores. One possibility is that in the morning subjects 
engage in more maintenance processing that takes no account of the meaning of the 
information and is thus uninfluenced by its importance. Such processing may enhance 
immediate, but not delayed, retention. It may also reduce the capacity for, or interfere 
with, the more elaborative processing that determines delayed retention. Since elaborative 
processing is based on the meaning of information, it is clearly capable of being affected by 
its importance. This suggestion can account for both the interaction between time of day of 
presentation and delay of recall, and the differential effect of importance on immediate and 
delayed retention following 09.00 presentation. 

Clearly further research is needed to elucidate the nature of the changes in 
information-processing responsible for these effects. However, the present results do suggest 
that the only advantage to be gained from confining the teaching of academic subjects to 
the morning will be the superior immediate retention of unimportant information. 
Afternoon presentation would appear to result in superior delayed retention of both 
important and unimportant information. While again further research is needed on this 
topic, it would appear that the current practice of teaching most academic material in the 
morning may well be ill-founded. 

When does context influence recognition memory? 


Duncan Godden and Alan Baddeley 





Hewitt (1977) has distinguished intrinsic context, which is directly involved in the encoding of 
material, and extrinsic context, comprising such arbitrary features of the learning situation as the 
environment of learning. While both types of context influence recall, with better performance when 
the original context is reinstated, recognition effects have been observed only with intrinsic context. 
The present study uses the contrast between the land and underwater environments to explore this 
apparent discrepancy. Subjects learned lists of 36 words either on land or under water, and 
subsequently tried to recognize them from a list of 72 words presented in either the same or the 
alternative environment. In contrast to an earlier recall study, no trace of context dependency was 
observed. Implications for the distinction between intrinsic and extrinsic context are discussed. 





There has in recent years been a great deal of research into the role of contextual cues in 
memory. Perhaps the most clearly delineated position is that of Tulving (Tulving & Osler, 
1968; Tulving & Thomson, 1973), who has proposed what he terms the encoding specificity 
principle. This assumes that a retrieval cue will be effective in prompting recall or 
recognition if, and only if, it was encoded with the relevant item during learning. Most 
studies on this issue have manipulated semantic context; for example Light & Carter-Sobell 
(1970) used words with more than one meaning (e.g. JAM) and showed that presenting 
such a polysemous word in one semantic context (JAM, strawberry), and testing in another 
(JAM, traffic) led to very poor performance. Other studies have shown that more subtle 
changes of semantic context may impair recall or recognition (Tulving & Osler, 1968; 
Barclay et al., 1974). 

In a recent unpublished review of the literature on contextual effects in memory, Hewitt 
(1977) has drawn a distinction between intrinsic and extrinsic context. The term intrinsic 
context refers to aspects of a stimulus which are inevitably processed when the stimulus is 
perceived and comprehended. Examples would be the type of lettering in which a word is 
written, the voice in which an item is spoken, and typically the semantic characteristics of a 
word and its semantic context. Extrinsic context refers to those characteristics of the 
stimulus situation which are irrelevant to the processing of the stimulus itself; the colour of 
the walls of the room in which an experiment is carried out would be an example of an 
extrinsic contextual cue. While such cues have a less reliable influence on retention than 
intrinsic contextual ones, there is clear evidence that material learned in one environment is 
better recalled there than in an alternative, very different setting. 

In a previous paper (Godden & Baddeley, 1975) we showed that the underwater 
environment allows a particularly clear demonstration of extrinsic context dependency, 
with divers who learned and recalled under water, or learned and recalled on dry land, 
remembering 46 per cent more than divers who learned in one environment and recalled in 
the other. Although the magnitude of the context dependency effect tends to be 
considerably smaller under the less dramatic manipulations of environment that are 
possible on dry land, it nevertheless presents a phenomenon of general interest from both a 
practical and a theoretical viewpoint. 

There is reason to suspect that extrinsic and intrinsic contextual effects may differ in one 
important respect. Whereas intrinsic context has a powerful effect on both recall and 
recognition (e.g. Light & Carter-Sobell, 1970; Marcel & Steel, 1973; Watkins et al., 1976) 
all the positive evidence for an influence of extrinsic context comes from studies using 
recall, while the few recognition studies that have been performed appear to have produced 
uniformly negative results. 

Baddeley et al. (1975) had divers learn passages of prose under water in either warm 
or cold conditions. On resurfacing, recall was required, followed by a task involving 
recognition of statements taken from the passages. Level of recall was considerably poorer 
than performance on a prior dry practice run, whereas recognition showed no comparable 
decrement. However, since the study was not set up to study context-dependency, the 
relevant controls were not included; nor was the dry practice passage strictly comparable 
with the underwater test passages. Davis et al. (1975) also obtained results suggesting that 
context dependency may be much stronger for recall than recognition. They tested divers 
three times on a task involving the recall and recognition of unrelated words. The first test 
was a practice run carried out on land, while the second and third were underwater tests in 
warm or cold water. In the recall test, subjects’ performance dropped dramatically from a 
mean of 9.3 words correct on land to 4.8 in warm conditions and 3.0 in cold water. In the 
case of recognition, however, performance did not differ between dry land, with 23.8 words 
being recognized out of 30, and warm water where 22.7 words out of 30 were correctly 
recognized, although there was a small but significant effect of cold on performance. Once 
again however the experiment was not set up to study context dependency, and order of 
presentation is clearly a confounding factor. 

A third source of consistent but inconclusive evidence comes from experiments on 
state-dependent memory, in which the subject's internal state is changed by means of a 
drug. A study by Goodwin et al. (1969) demonstrated clear state-dependency effects on a 
range of recall tasks when subjects were required to learn under the influence of alcohol 
and recall either drunk or sober. However, no context-dependent effect was found in a task 
involving the recognition of pictorial material. A study by Wickelgren (1975) again showed 
an effect of alcohol on learning, but no evidence of state dependency when performance 
was tested by recognition. Hence, although the available evidence is fragmentary and 
suggestive rather than conclusive, it indicates that extrinsic context dependency effects 
may be avoided if memory is tested by recognition rather than by recall. 

If one is predicting a negative result, it is clearly necessary to choose a situation in which 
powerful context dependency effects are known to occur. We therefore again opted to use 
the underwater environment which has been found to generate a much stronger effect than 
is typically obtained on land (Godden & Baddeley, 1975). Divers learned lists of words in 
both dry (D) and wet (W) conditions and subsequently were required to recognize them 
from a list containing an equal number of filler words, in either the same or the alternative 
environment. All divers performed under all possible conditions; DD (learn dry, recognize 
dry), DW, WW and WD. 


Method 
Subjects 


Sixteen subjects, 12 male and four female members of a university diving club, were tested. 


Apparatus 


Five lists of words were constructed and subsequently recorded on tape (see Procedure). Each list 
consisted of 36 unrelated, two- or three-syllable words chosen at random from the Toronto word 
bank. The words were presented via a Diver Underwater Communication (DUC) set. This consists of 
a surface-to-diver telephone cable, terminating in a bone transducer, which, placed on the diver's 
mastoid, enables both surface-to-diver and diver-to-surface communication. The DUC set was 
modified such that taped material, monitored by the surface operator, could be presented directly to 
the subject using a cassette tape-recorder. A twin transducer on the set allowed two subjects to be 
tested during the same period. Weighted Formica boards, sealed with transparent Fablon, enabled 
subjects to record responses in pencil both on land and underwater. Subjects used standard SCUBA 
breathing apparatus and diving equipment of various designs dictated by personal preference. 


Procedure 


All instructions and stimuli for each experimental session were recorded on tape. Efficient auditory 
perception of stimuli by a submerged diver using SCUBA apparatus is seriously impaired by the 
noise of his breathing. The presentation of the material was therefore grouped, so as to allow the 
diver to adopt a comfortable breathing rate which did not interfere with his auditory perception. 
Thus, each list was presented in blocks of three words. Within each block, the words were spaced at 
2 s intervals. Between each block, a 4 s interval enabled subjects to exhale, inhale and hold their 
breath in readiness for the presentation of the next block, and so on. 
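
The timing of one presentation can be sketched as a list of word onsets. This is an illustration under an assumed reading of the description: the 2 s spacing is taken as onset to onset within a block, and the 4 s interval is measured from the onset of the last word of one block to the onset of the first word of the next. The function name is hypothetical.

```python
def word_onsets(n_words=36, block_size=3, within_gap=2.0, between_gap=4.0):
    """Return the assumed onset time in seconds of each word in one presentation."""
    onsets = []
    t = 0.0
    for i in range(n_words):
        onsets.append(t)
        last_in_block = (i + 1) % block_size == 0
        # After the last word of a block, wait the longer breathing interval.
        t += between_gap if last_in_block else within_gap
    return onsets


onsets = word_onsets()
print(onsets[:6])   # first two blocks
print(onsets[-1])   # onset of the final word
```

Under these assumptions each block occupies 8 s from its first word to the first word of the next block, so a 36-word list has its last word about a minute and a half after its first.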

Each tape began with an explanation of this breathing procedure, followed by a ‘breathing pattern’ 
section, to ensure that subjects were breathing correctly and in rhythm before the first word of the list 
appeared. This section consisted of nine spoken presentations of the letter z, in three blocks of three, 
and with identical spacings to those of the words in the list itself. Immediately after each block of zs, 
subjects heard the command 'breathe'. The presentation of the word list followed on naturally in 
rhythm, and the command to breathe was then dropped. 

On each tape (one for each of the 16 condition/list combinations), the relevant list was presented 
twice. Between the first and second presentations, a gap of 10 s allowed subjects a short rest with 
unconstrained breathing. The second presentation was again preceded by the breathing pattern 
section. 

To eliminate the complication of possible primary memory effects (Glanzer & Cunitz, 1966), the 
second presentation of each list was followed by 15 digits which subjects were required to copy at a 
rate of 2 seconds per digit. This was followed by the next instruction (e.g. ‘Ascend to the shore 
station’), and a 4 min delay. This delay occurred in all conditions and was necessary to enable 
subjects to comply safely with the relevant instruction. They were then required to recognize as many 
of the words as they could from a list containing the target words mixed randomly with an equal 
number of comparable filler words, also taken from the Toronto word bank. These were presented 
using typed lists attached to weighted Formica boards, which were water sealed. Responses were made 
by marking strips of PVC adhesive tape, attached to the boards alongside the lists, with a horizontal 
dash if a word was recognized, and a zero if it was not. They were instructed to work through the list 
serially, once only, making their decision about each word as it occurred. This took, on average, 
about 2 min. 

The original 16 subjects were split at random into four groups of four. Prior to the first 
experimental session, all subjects underwent a practice session, comfortably seated around a table. 
During this they first practised the breathing technique, then the task itself, using a practice list. 

Pairings of the remaining four lists L1...L4 with the four conditions, and the temporal orderings 
of the conditions for each of the four groups, were arranged according to a Graeco-Latin square 
design. Subjects experienced one condition per diving session, and the sessions were separated by 
approximately 24 hours. The design was such that each group experienced conditions and lists in 
different orders, that a given condition/list pair was never administered to more than one group, and 
that lists and conditions had equal representations on each experimental session. 
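
A Graeco-Latin square with the stated properties can be sketched as follows. This is one possible 4 x 4 square, built from a standard pair of orthogonal Latin squares; it is not claimed to be the square actually used, and the list labels and the mapping of rows to groups and columns to sessions are assumptions.

```python
CONDITIONS = ("DD", "DW", "WW", "WD")
LISTS = ("L1", "L2", "L3", "L4")  # list labels assumed for illustration

# Multiplication by the element "2" in the finite field GF(4), with the
# field's elements written as 0..3 and field addition written as XOR.
GF4_TIMES_2 = (0, 2, 3, 1)


def graeco_latin_square():
    """Return a 4 x 4 grid of (condition, list) pairs.

    Rows are groups and columns are sessions. The condition index is
    g XOR s and the list index is (2 * g) XOR s, which yields two
    mutually orthogonal Latin squares of order 4.
    """
    return [
        [(CONDITIONS[g ^ s], LISTS[GF4_TIMES_2[g] ^ s]) for s in range(4)]
        for g in range(4)
    ]


square = graeco_latin_square()
for group, row in enumerate(square, start=1):
    print(f"Group {group}:", "  ".join(f"{c}/{l}" for c, l in row))
```

In this square every group meets each condition and each list once, every session contains each condition and each list once, and all 16 condition/list pairs occur exactly once, which matches the three properties described above.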

Subjects in environment D (dry) sat by the edge of the water, masks tipped back, breathing tubes 
removed, and receivers in place. In environment W (wet), subjects dived to approximately 5 m, taking 
with them their Formica board and two pencils, and with their receivers in position. Subjects sat on 
the bottom with one arm looped round a heavy chain, and the session began after a verbal signal to 
the surface operator signified their readiness. 

To control for the possibility of a ‘disruption’ effect, resulting from different amounts of activity 
between presentation and test in the different conditions (Strand, 1970), the following procedure was 
adopted. In the WW condition, subjects, after performing the digit copying task, surfaced to collect 
their response boards, then returned to their original submerged position to perform the recognition. 
In the DD condition, subjects were required to enter the water, dive, and return to the surface 
between digit copying and recognition. In addition, subjects learning ‘dry’ were required to get 
thoroughly wet and cold before the session began, to control for possible differential effects of cold. 
Testing took place at an open freshwater site near Cambridge, England. 


Table 1. Recognition performance as a function of learning and test environment 

                                   Test environment
Learning environment               Dry          Wet

Dry
  Hit rate                         0.76         0.75
  False positive rate              0.17         0.18
  d'                               1.87         1.89
  Recall probability              (0.36)       (0.24)
    (previous study)

Wet
  Hit rate                         0.70         0.68
  False positive rate              0.19         0.21
  d'                               1.63         1.44
  Recall probability              (0.23)       (0.32)

Figures in parentheses show recall probabilities from Godden & Baddeley (1975). 


Results 


Mean probability of correct recognition and false positive probability are shown in Table 1. For 
comparison, recall scores from our earlier study are also presented. For 
context-dependency to be present, the effect of the environment on recognition should 
depend on the environment of original learning. This would be represented by an 
interaction between the cross-totals (DD+ WW) and (DW +WD) in Table 1. Since the 
cross-totals were almost identical (1-44 and 1-45) there is clearly no trace of an interaction. 
Examination of the false positive rates suggests that the absence of an interaction is not due 
to the application of different criteria across the four conditions. This conclusion is 
reinforced if detections and false positives are combined to give a d’ measure; the slight 
difference in cross-totals (3-30; 3-52) is in the opposite direction to that predicted on the 
assumption that the recognition performance is context dependent. The only significant 
main effect was that of the environment of original learning, with more words recognized 
when learning took place on dry land than were recognized when it took place under water 
(P « 0-01). This could reflect either distraction caused by the less familiar underwater 
environment, or caused by possibly slightly noisier presentation conditions under water. 
The absence of an interaction is in marked contrast with the results of the previous study 
which used recall rather than recognition. 
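
The cross-total comparison can be retraced from the cell values in Table 1. The sketch below is illustrative only: the dictionary names and helper functions are ours, and the pooled d' computation assumes the standard equal-variance signal detection formula, z(hit rate) minus z(false positive rate). The tabled d' values were presumably computed per subject and then averaged, so d' recomputed from the pooled rates need not match them.

```python
from statistics import NormalDist

# Cell values copied from Table 1, keyed by (learning, test) environment.
# D = dry, W = wet.
hit_rate = {("D", "D"): 0.76, ("D", "W"): 0.75, ("W", "D"): 0.70, ("W", "W"): 0.68}
fp_rate = {("D", "D"): 0.17, ("D", "W"): 0.18, ("W", "D"): 0.19, ("W", "W"): 0.21}
d_prime_table = {("D", "D"): 1.87, ("D", "W"): 1.89, ("W", "D"): 1.63, ("W", "W"): 1.44}


def cross_totals(cells):
    """Return (matched, mismatched) totals, i.e. (DD + WW, DW + WD)."""
    matched = cells[("D", "D")] + cells[("W", "W")]
    mismatched = cells[("D", "W")] + cells[("W", "D")]
    return round(matched, 2), round(mismatched, 2)


def d_prime(hit, false_positive):
    """Equal-variance signal detection index: z(hit) - z(false positive)."""
    z = NormalDist().inv_cdf
    return z(hit) - z(false_positive)


hit_matched, hit_mismatched = cross_totals(hit_rate)
dp_matched, dp_mismatched = cross_totals(d_prime_table)
print("Hit-rate cross-totals:", hit_matched, hit_mismatched)
print("d' cross-totals (tabled values):", dp_matched, dp_mismatched)

# Assumption: d' recomputed from the pooled cell means, for comparison only.
pooled = {cell: d_prime(hit_rate[cell], fp_rate[cell]) for cell in hit_rate}
pooled_matched = pooled[("D", "D")] + pooled[("W", "W")]
pooled_mismatched = pooled[("D", "W")] + pooled[("W", "D")]
print("d' cross-totals (pooled rates):", round(pooled_matched, 2), round(pooled_mismatched, 2))
```

Context dependency would show up as a matched total clearly larger than the mismatched total. The rounded tabled d' values sum to 3.31 for the matched cells, against the 3.30 quoted in the text, presumably because the text summed unrounded values.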


Discussion 

Our results show no evidence for a context-dependent effect. As such they are consistent 
with the experiments described previously, and support the view that recognition memory 
is resistant to extrinsic context dependency. Why should this be so?

Let us consider first the two most prominent accounts of retrieval effects, namely the 
encoding specificity hypothesis and the list tagging hypothesis. An encoding specificity 
approach such as that advocated by Tulving & Thomson (1973) fits the recall data very 
well. If one assumes that environmental cues are encoded at the time of learning, then it 
follows that reinstating such cues should facilitate recall. It should, however, also facilitate 
recognition; indeed much of the strongest evidence supporting the encoding specificity 
hypothesis explicitly uses recognition rather than recall (Tulving & Thomson, 1973). 
Should one then conclude that the stimuli were not encoded during learning? If this is the 
case, why then should reinstating the context enhance recall? It is far from clear how the
encoding specificity hypothesis can account for the observed results.

A list tagging hypothesis (e.g. Anderson & Bower, 1973) might attempt to account for 
the recall results by suggesting that contextual stimuli become associated with the list 
during learning, and serve as tags which help differentiate the material to be learnt from 
irrelevant material learnt elsewhere. Reinstatement of the environmental context 
presumably primes that particular tag, making it more accessible and more usable. Once 
again, however, it is not clear why a similar effect should not be found with recognition. 

One possibility is that the limit of performance in the recall task is set by retrieval, since 
there is good evidence in free recall to suggest that subjects typically learn much more than 
they can recall (see Baddeley, 1976, pp. 285-287 for a review). In the case of a recognition 
test, however, in which performance is much higher, it seems possible that the performance 
limit is set by the degree of initial learning. Such an explanation is possible but seems 
implausible since it assumes that a recognition test avoids all retrieval problems, and makes 
accessible all the information acquired during learning. Such a view is inconsistent with the 
demonstration by Brown (1964) that in a multiple-choice recognition test, if the first 
attempt is incorrect, then the second attempt has an above-chance probability of success. 
More recently, Tulving and his associates have demonstrated that subjects may be able to 
recall items they have failed to recognize (Tulving & Thomson, 1973), a result that
suggests strongly that recognition does not eliminate the retrieval process, although it may 
influence it. 

A further possibility, suggested by an anonymous reviewer, stems from the observation 
that in our experiment, extrinsic context is a feature which is common to all items in a list, 
whereas typically intrinsic context changes from item to item. This is certainly typically the 
case, and it would be of interest to separate these factors, although in practice this might 
prove very difficult to achieve. It is not clear, however, how the outcome of such a 
hypothetical study would help explain our present results. For both recall and recognition 
the context is constant throughout a particular list, so that any differences obtained seem 
unlikely to be attributable to failure to manipulate this particular variable. 

A possible explanation is offered by Brown’s suggestion that ‘the primary difference 
between recall and recognition is that in recall access to the unit word code must be from 
the context code but in recognition access is guaranteed by the physical presence of the 
word itself’ (Brown, 1976, p. 10). Since access from the context to the word is unreliable,
anything which enriches this link by reinstating the original physical circumstances will
enhance performance. In contrast to this, access to the word code from the printed word is
much stronger and more reliable, and is unlikely to be facilitated by an extrinsic contextual 
cue. 

Taken at face value, however, such an argument would suggest that recognition memory 
should be equally insensitive to the effects of intrinsic context, and that items which can be 
recalled should almost invariably be recognized. Both of these conclusions are inconsistent 
with available evidence (e.g. Light & Carter-Sobell, 1970; Tulving & Thomson, 1973). 
However, this apparent paradox ceases to be a problem if one makes the plausible 
assumption that intrinsic context directly influences what is learned. This is most obvious 
in the case of the polysemous words used by Light & Carter-Sobell (1970), where the
semantic interpretation of the word jam is completely different when accompanied by the
cue strawberry from when it is presented with the cue traffic. The fact that both semantic
meanings share a single graphemic and phonological code is irrelevant if learning occurs at 
the semantic level. 

The same argument can be applied in the less extreme case such as that of Tulving & 
Osler (1968) who presented a word such as city with such separate cues as dirty and village. 
A concept like city is semantically very rich; the aspect of it that will be encoded, given the 
cue dirty, is likely to include such associated features as garbage, traffic fumes and dust; in
contrast, the city-village pairing suggests a quiet, clean, friendly enclave within a city. 

A similar interpretation can be made of the experiments by Watkins et al. (1976), and by
Winograd & Rivers-Bulkeley (1977), both of which show that the recognition of a
photograph of a face may be influenced by the presence of a second face. At first sight, this
might seem to be inconsistent with our claim that extrinsic context does not influence
recognition. Typically, however, the subject is instructed to relate the faces, either explicitly
as in the Watkins et al. instruction to judge the compatibility of the people depicted, or
implicitly as in Winograd & Rivers-Bulkeley's requirement that the subject make a decision
as to the friendliness of the two people depicted. Since they were portrayed facing each
other, such an instruction could easily be interpreted as requiring a judgement of the
implied relationship between the two, hence again causing the encoding of one face to be
influenced by the nature of the other.


In brief, we suggest that intrinsic context affects recognition memory because the context 
determines what is learned, and subsequently guides the subject back to the interpretation 
of the stimulus that occurred during acquisition. By definition, extrinsic context bears a 
purely arbitrary relationship with the material learned. As such, it does not determine the 
interpretation of the material, and hence can contribute nothing to the already powerful 
cues presented by the physical presence of the words to be remembered. 
The validity of age-of-acquisition ratings
K. J. Gilhooly and M. L. M. Gilhooly 





A number of recent studies have reported that words rated by adults as early acquired are retrieved 
more rapidly than words rated as later acquired. Age-of-acquisition ratings have been found to be 
highly reliable, but the validity of the ratings has been little investigated. This paper reports two 
validation studies. In the first, words were taken from a standardized vocabulary test and rated by 
naive adult subjects for age-of-acquisition. The adult ratings agreed closely with the rank order of the 
vocabulary test words, an order based on age norms. In the second study, words that had already 
been rated on age-of-acquisition by adult subjects were given as a vocabulary test to groups of 
subjects ranging in age from approximately 5-21 years. Objective estimates of age-of-acquisition from 
the vocabulary test data were highly correlated with the subjective age-of-acquisition ratings. In both 
experiments, multiple regression analysis indicated that rated age was the major independent 
predictor of the objective age-of-acquisition indices. It was concluded that ratings are valid measures 
of true age-of-acquisition. 





A number of recent studies have found evidence to suggest that words acquired early are 
retrieved more rapidly from the mental lexicon than words acquired later. Carroll & White 
(1973), in an experiment based on Oldfield & Wingfield (1965), found that the rated 
age-of-acquisition of the names was highly predictive of picture naming speed and was a 
more important factor than the objective frequency measures provided by the 
Thorndike-Lorge (1944) and the Kučera-Francis (1968) counts. The role of rated
age-of-acquisition in picture naming has been confirmed in later studies by Lachman (1973) 
and Lachman et al. (1974) in which the codability of the pictures was taken into account as 
well as picture name frequencies. Loftus & Suppes (1972), using the retrieval task of 
category-instance naming (e.g. ‘Name a fruit’), found that Thorndike-Lorge juvenile 
frequency counts (of both category and instance names) were better predictors of speed of 
instance production than the corresponding Thorndike-Lorge adult counts. In this task, 
then, frequency of childhood usage predicted retrieval speed better than adult frequency, a 
finding which again points to an age-of-acquisition effect. In the case of anagram solving, 
Stratton et al. (1975) reported that the speed and likelihood of solving anagrams were 
predicted better by the rated age-of-acquisition of the solution word than by its familiarity, 
frequency, imagery and meaningfulness. Gilhooly & Johnson (1978), however, found that 
although rated age-of-acquisition had a significant simple correlation with anagram 
solution probability, the effect dwindled when measures of the letter order predictability of 
the solution (Mendelsohn, 1976; Gilhooly, 1978b) were partialled out. More recently,
Gilhooly (1978a) found rated age-of-acquisition to be a significant factor in a word
completion task (e.g. ‘Fill in the blanks to make a word, ST _ _ _.’), in that early acquired
words were more likely to be produced than later acquired words.

Age-of-acquisition ratings have been found to be highly reliable (e.g. Gilhooly & Hay, 
1977) but how the data on age-of-acquisition ratings and performance are to be interpreted 
depends, of course, on the validity of the ratings. If the ratings are valid indices of true 
age-of-acquisition, then the rating procedure is a very convenient way of obtaining such 
measures. It seems important, therefore, to determine the extent to which adult ratings 
reflect true age-of-acquisition. 

Some work has already been done on the validation problem. Carroll & White (1973) 
assigned objectively based age indices to the words in their study. These indices were 
derived from existing tables of the frequency with which the words were known in reading 
and used in writing by grade school children (Rinsland, 1945; Dale, 1948). The subjective 
ratings obtained from adults correlated to a satisfactory extent with the objective age
indices (r = 0.847), indicating some degree of validity for the ratings. The studies reported
here were designed to add to the stock of validation data and to obtain more direct
objective measures of age-of-acquisition than Carroll & White employed.

Two approaches to the validity problem were followed in the present paper. In Expt 1, 
words for which objective norms were available were taken from a standardized vocabulary 
test and naive subjects rated these words for age-of-acquisition. The correlation between 
age ratings and age norms was then obtained. In Expt 2, words for which age-of-acquisition 
ratings were already available were given to children of varying ages as a vocabulary test. 
The responses of the different age groups were then used to calculate objectively based 
estimates of age-of-acquisition. These estimates were then correlated with the subjective 
ratings. 


Experiment 1 
Method 


Word selection. All 40 words from Set One of the Crichton Vocabulary Scale (1950) were selected. Set 
One of the Crichton Vocabulary Scale consists of the first 20 words of the Mill Hill Vocabulary Scale, 
Set A and the first 20 words of the Mill Hill Vocabulary Scale, Set B. 

Added to the 40 words from the Crichton Vocabulary Scale (Set One) were the last 13 words from 
the Mill Hill Vocabulary Scale, Junior Form, Set B. This made a total of 53 words. The Crichton and 
Mill Hill Vocabulary Scales were chosen because they are well-established scales that were developed 
and validated with British subject samples. 

The words in the Mill Hill Vocabulary Scale are arranged in order of the frequency with which 
they are usually known. According to the norms presented by Raven (1954), between the ages of 5 
and 16 years there is an increment of one to two words per year. The order of the words in the 
Crichton Vocabulary Scale is based on the frequency with which children under 11 years of age were 
able to explain their meanings. Tables of norms presented by Raven (1961) for the Crichton 
Vocabulary Scale indicate that between the ages of 5 and 11 years there is a fairly uniform increment 
of approximately three words known each year for Set One. The 53 words used are given in Table 1. 


Table 1. Words from Crichton and Mill Hill vocabulary tests used in Expt 1

Crichton Vocabulary Test, Set 1                          Mill Hill, Set B
 1. cap         15. unhappy      29. resemblance         41. dwindle
 2. tomato      16. perfume      30. brag                42. lavish
 3. frock       17. ache         31. anonymous           43. whim
 4. rest        18. view         32. liberty             44. surmount
 5. patch       19. receive      33. mingle              45. bombastic
 6. damp        20. continue     34. fascinated          46. recumbent
 7. loaf        21. startle      35. courteous           47. envisage
 8. cruel       22. connect      36. prosper             48. trumpery
 9. afraid      23. stubborn     37. elevate             49. glower
10. blaze       24. provide      38. thrive              50. perpetuate
11. near        25. squabble     39. precise             51. levity
12. battle      26. shrivel      40. verify              52. libertine
13. rage        27. malaria                              53. amulet
14. disturb     28. schooner





Age-of-acquisition rating scales. The 53 words were randomized and presented on four pages which 
were randomly ordered into booklet form. Alongside each word was a nine-point scale, ranging from 
age 1-2 years to age 17 and older.


Subjects. The subjects were 70 students in psychology at Aberdeen University. The forms were 
administered during two laboratory classes. There were 20 male and 50 female subjects. 
Procedure. The subjects were given the forms and asked to indicate on the rating scale when they
thought they had learned the words presented. In order to compare age-of-acquisition ratings with 
word frequency measures, frequency scores were obtained from two sources, (i) the Thorndike-Lorge 
count (Thorndike & Lorge, 1944), and (ii) the Brown University word count (Kučera & Francis, 
1967). 


Results and discussion 


Each of the 53 words was scored on the following variables. 
(1) Average rating by 70 subjects on the nine-point scale 
(2) Rank position in the Mill Hill/Crichton tests (later positions being more difficult, 
presumably later acquired words). This measure is referred to as ‘Mill Hill rank position’ 
in the remainder of this paper 
(3) Length of word (in letters)
(4) Thorndike-Lorge frequency (frequencies of ‘A’ were scored as 50 and frequencies of
‘AA’ as 100)
(5) Kučera-Francis frequency

Since both frequency measures yielded positively skewed distributions the
Thorndike-Lorge and Kučera-Francis scores were logarithmically transformed before
further analysis. The logarithmic transformations corrected the skews satisfactorily and 
markedly increased the magnitude of the correlations between the frequency variables and 
the criterion variable. 


Table 2. Correlations and summary statistics for Expt 1 (n = 53)

Measure                               1        2        3        4        5
1. Mill Hill rank position            1.0      0.93     0.53    -0.77    -0.58
2. Rated age-of-acquisition                    1.0      0.55    -0.79    -0.62
3. Length of word                                       1.0     -0.48    -0.25
4. Log Thorndike-Lorge frequency                                 1.0      0.75
5. Log Kučera-Francis frequency                                           1.0
Mean                                  27.0     4.37     6.59     2.52     1.84
SD                                    15.44    1.70     1.83     1.52     1.68

Note. Correlations significant at 0.05 level (on a two-tailed test) are given in italics.


Pearson correlations among the measures and summary statistics are given in Table 2. It 
is clear from Table 2 that rated age-of-acquisition was the best single predictor of Mill Hill 
rank position (r = 0.93), with log Thorndike-Lorge frequency next best (r = -0.77).
Simultaneous multiple regression analysis (Table 3) indicated that rated age was the only 
variable that made a significant independent contribution in predicting the criterion measure 
(Mill Hill rank position). Thus, the correlations between the frequency scores and the 
criterion were mainly due to their confounding with the major predictor, rated age. 

These results are consistent with the notion that age ratings do reflect order of acquisition. 
In this study we did not have measures available on other word attributes such as 
concreteness, imagery or meaningfulness that might also be relevant, or possibly even be 
better predictors of order of acquisition than age ratings or frequency scores. The next 
study remedies this deficiency. 




Table 3. Simultaneous multiple regression analysis of Expt 1 data. Predicted variable, Mill
Hill rank position (n = 53)

Variable                      B        Beta      SE B      F
Rated age-of-acquisition      7.72     0.850     0.835     85.45***
Length of word                0.06     0.007     0.555     0.01 n.s.
Log Thorndike-Lorge          -1.31    -0.129     1.067     1.51 n.s.
Log Kučera-Francis            0.41     0.044     0.762     0.29 n.s.

*** P < 0.001.

Note. R = 0.930, R² = 0.87, F = 77.06***, d.f. = 4, 48.
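
The entries in Table 3 can be cross-checked against the standard deviations in Table 2 using three textbook identities for least-squares regression: Beta = B × SD(predictor) / SD(criterion); the F for a single coefficient is (B / SE B)²; and the overall F follows from R² and its degrees of freedom. The following is a minimal sketch of that arithmetic, not the authors' analysis; the function and variable names are ours, and small discrepancies are to be expected because the published values are rounded.

```python
# Values copied from Tables 2 and 3 (Expt 1). Criterion: Mill Hill rank position.
SD_CRITERION = 15.44
N_WORDS = 53

# predictor name: (B, SE of B, SD of the predictor from Table 2)
predictors = {
    "rated_age": (7.72, 0.835, 1.70),
    "length": (0.06, 0.555, 1.83),
    "log_thorndike_lorge": (-1.31, 1.067, 1.52),
    "log_kucera_francis": (0.41, 0.762, 1.68),
}


def standardized_beta(b, sd_predictor, sd_criterion):
    """Convert a raw regression weight B into a standardized weight Beta."""
    return b * sd_predictor / sd_criterion


def coefficient_f(b, se_b):
    """F ratio for one coefficient (1 numerator d.f.), i.e. the squared t."""
    return (b / se_b) ** 2


def overall_f(multiple_r, n_cases, n_predictors):
    """Overall F for the regression, computed from the multiple R."""
    r_squared = multiple_r ** 2
    df_residual = n_cases - n_predictors - 1
    return (r_squared / n_predictors) / ((1 - r_squared) / df_residual)


for name, (b, se_b, sd_x) in predictors.items():
    beta = standardized_beta(b, sd_x, SD_CRITERION)
    print(f"{name}: Beta = {beta:.3f}, F = {coefficient_f(b, se_b):.2f}")

print(f"Overall F = {overall_f(0.930, N_WORDS, len(predictors)):.2f}")
```

Recomputed this way, the values agree with Table 3 to within rounding, which also supports reading the raw weights for rated age and log Thorndike-Lorge frequency as 7.72 and -1.31.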


Experiment 2 
Method 


Word sample. Forty-eight five-letter words were taken from Gilhooly & Hay's (1977) list so as to 
represent a spread of scores on rated age-of-acquisition. The words were divided into two sets of 24 
words and are listed in Table 4. 


Subjects. Twenty children from Primaries 1, 3, 5 and 7 at Ashley Road Primary School, Aberdeen, 
and 10 from the first-year class at Aberdeen Grammar School took part. The average ages of these 
groups were 5:5, 7:5, 9:5, 11:6 and 13:1 (years and months) respectively. 

Twenty student subjects from Aberdeen University also took part. The average age of this group 
was 21:1. 

All groups were balanced for sex. 


Procedure. The primary groups were divided in two, and one half was tested with the 24 words in Set 
1, while the other half was tested with the 24 words in Set 2. The children were asked to say what 
each word meant. Children were tested individually, using an oral presentation. Words were given in
a different random order to each child. A generous criterion was used in assessing the answers since 


Table 4. Words used in Expt 2 in order of rated age-of-acquisition within each setᵃ

Set 1                            Set 2
 1. house      13. baton          1. chair      13. mercy
 2. mouth      14. token          2. thumb      14. brink
 3. uncle      15. grief          3. black      15. cramp
 4. light      16. index          4. fairy      16. flank
 5. clown      17. havoc          5. glove      17. clamp
 6. prize      18. fraud          6. fight      18. query
 7. chalk      19. lyric          7. cloth      19. flirt
 8. trick      20. unity          8. plank      20. wench
 9. blade      21. opium          9. grave      21. vodka
10. scout      22. bigot         10. dunce      22. logic
11. force      23. fovea         11. hound      23. forum
12. pylon      24. odium         12. judge      24. ovary

ᵃ The age-of-acquisition ratings (Gilhooly & Hay, 1977) were obtained following the method used by
Carroll & White (1973). Student subjects were given seven-point scales and were asked to indicate
when they thought they learned each word. The scale ranged from age 0-2 years (each of the seven
points represented a 2 year period) to age 13 and older.
we were interested in whether each word was known rather than in the ability to give detailed
dictionary definitions.


Words which were not responded to acceptably by at least 50 per cent of the Primary 7 groups
were then used in testing the first-year secondary and student samples. With these groups a written
presentation and response method was used and subjects were tested in groups rather than
individually.


Results and discussion


Each word was scored in terms of the number of subjects giving acceptable responses in
each age group. These scores were then used to calculate, by interpolation, the average age,
in months, at which 50 per cent of subjects would be expected to be correct from our data.
The calculated ages were taken as chronological estimates of age-of-acquisition. (Two
words, FOVEA and ODIUM, were not known by half our student sample and these were
dropped from the analysis as it was not possible to use the interpolation procedure on
these words.)
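
The interpolation step might be sketched as follows. This is an illustration under stated assumptions, not the authors' exact procedure: it assumes simple linear interpolation between adjacent age groups, uses the group mean ages from the Subjects section converted to months, and the function name, the example proportions and the handling of words that never cross the criterion are hypothetical. How words already known by half of the youngest group were treated is not stated, so the sketch returns None for them too.

```python
from typing import Optional, Sequence

# Mean ages of the six groups in months, from the Subjects section
# (5:5, 7:5, 9:5, 11:6, 13:1 and 21:1 in years:months).
GROUP_AGES_MONTHS = (65, 89, 113, 138, 157, 253)


def age_at_half_correct(
    proportions: Sequence[float],
    ages: Sequence[float] = GROUP_AGES_MONTHS,
    criterion: float = 0.5,
) -> Optional[float]:
    """Linearly interpolate the age (in months) at which `criterion` is reached.

    `proportions` holds the proportion of acceptable responses per age group,
    youngest first; it may be shorter than `ages` when older groups were not
    tested on the word. Returns None when no adjacent pair of groups brackets
    the criterion (assumption: such words cannot be interpolated).
    """
    points = list(zip(ages, proportions))
    for (age_lo, p_lo), (age_hi, p_hi) in zip(points, points[1:]):
        if p_lo < criterion <= p_hi:
            fraction = (criterion - p_lo) / (p_hi - p_lo)
            return age_lo + fraction * (age_hi - age_lo)
    return None


# Hypothetical word: crosses 50 per cent between the 89- and 113-month groups.
estimate = age_at_half_correct((0.1, 0.3, 0.6, 0.9))
print(estimate)

# Hypothetical word never known by half of any group, as with FOVEA and ODIUM.
print(age_at_half_correct((0.0, 0.0, 0.1, 0.1, 0.2, 0.4)))
```

For the first hypothetical word the estimate falls two-thirds of the way between 89 and 113 months, at about 105 months.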


Measures were available on each word of rated age-of-acquisition, imagery, concreteness, 
familiarity and meaningfulness (Gilhooly & Hay, 1977). Thorndike-Lorge and 
Kučera-Francis frequency scores were also obtained for each word. These frequency
scores were subjected to logarithmic transformations to correct for positive skews. 
Summary statistics and intercorrelations among the measures are given in Table 5. 


Table 5. Correlations and summary statistics for Expt 2 (n = 46)

Measure                          1        2       3       4       5       6       7
1. Chronological age estimate    1.0      0.84   -0.63   -0.42   -0.42   -0.65   -0.68
2. Rated age-of-acquisition               1.0    -0.71   -0.47   -0.47   -0.67   -0.70
3. Imagery                                        1.0     0.82    0.62    0.43    0.30
4. Concreteness                                           1.0     0.54    0.30    0.06
5. Meaningfulness                                                 1.0     0.47    0.28
6. Familiarity                                                            1.0     0.68
7. Log Thorndike-Lorge                                                            1.0
8. Log Kučera-Francis
Mean                             100.98   3.87    4.63    4.90    4.96    4.17    2.80
SD                               46.53    1.64    1.38    1.35    0.87    1.05    1.39

Note. Correlations significant at 0.05 level (on a two-tailed test) are given in italics.



Table 6. Simultaneous multiple regression analysis of Expt 2 data. Predicted variable,
chronological age estimate (n = 46)

Variable                      B        Beta      SE B      F
Rated age-of-acquisition      14.25    0.504     4.976     8.20**
Imagery                      -4.47    -0.136     6.766     0.46 n.s.
Concreteness                 -1.27    -0.037     5.392     0.06 n.s.
Meaningfulness                2.06     0.039     5.978     0.12 n.s.
Familiarity                  -6.28    -0.142     6.341     0.98 n.s.
Log Thorndike-Lorge          -9.96    -0.297     5.692     3.06 n.s.
Log Kučera-Francis            3.17     0.121     3.901     0.66 n.s.

*** P < 0.001; ** P < 0.01.

Note. R = 0.861, R² = 0.74, F = 15.58***, d.f. = 7, 38.

It is clear from Table 5 that rated age is the best predictor (r = 0.84) of the chronological
estimates, with log Thorndike-Lorge frequency next best (r = -0.68). Simultaneous
multiple regression analysis (Table 6) indicated that rated age was the only variable that
made a significant independent contribution in predicting the criterion measure
(chronological age estimate).

Rated age, on its own, accounts for 70 per cent of the variance in the chronological
estimates while the remaining six predictor variables account for less than a further 4 per
cent of the criterion variance.
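
The variance figures quoted here follow from the simple correlation in Table 5 and the multiple R reported with Table 6. A minimal sketch of the arithmetic (the function and constant names are ours):

```python
R_RATED_AGE = 0.84   # simple correlation with the chronological estimates (Table 5)
MULTIPLE_R = 0.861   # multiple R for all seven predictors (Table 6)


def variance_partition(simple_r, multiple_r):
    """Return (variance explained by one predictor alone, increment from the rest)."""
    alone = simple_r ** 2
    total = multiple_r ** 2
    return alone, total - alone


alone, increment = variance_partition(R_RATED_AGE, MULTIPLE_R)
print(f"Rated age alone: {alone:.1%}")                    # 70.6%
print(f"Added by the other six predictors: {increment:.1%}")  # 3.6%
```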


The results of Expt 2 are consistent with those of Expt 1 and support the hypothesis that 
adults' ratings are valid indices of age-of-acquisition. The question remains, of course, of 
how adults make such estimates. We intend to tackle this question in future studies. 
Cultural anchorages: Artifacts in field and laboratory studies
M. K. MacNeil, D. J. Pace and R. A. Wark 





Frequency distributions compiled from data in two unrelated judgement situations were analysed 
with regard to the greater frequency of certain numbers relative to other numbers. In a laboratory
situation subjects were requested to give their ‘best estimates’ of the distance between two
tachistoscopically presented points of light. In the second situation, data were collected from hunters
who estimated the number of quail they had killed during the preceding season. The frequency
distributions from both studies contained significantly more anchored than non-anchored judgements.
Analysis of the numbers given in the laboratory (hex situation) study also showed a strong preference
for ‘multi-anchored’ rather than non-anchored numbers. In the field (quail-kill) study the
‘triple-anchored’ numbers were the estimates given with proportionately the greatest frequency.





Ask a person to tell you how many books he read last year, how many new people he has 
met in the past 5 years, or the distance to a location he sometimes visits, and you are 
setting up an estimation situation which demands a quantified response. Few will hesitate 
to give a response in a form which appears to be numerically precise. Both field survey 
studies and those purportedly more precise studies done in the laboratory depend on 
respondents or subjects stating their perception, or memory, of the attribute in question in 
what is presumed to be an accurate quantified form. 

The differential representation of items that stand side by side on a reference scale and 
appear, objectively, to be equally probable of inclusion can be explained in terms of
anchorages, which may be defined in terms of the relatively high saliency of some of the 
items making up a reference scale. Anchorages may be in either the physical, i.e. external, 
properties of the stimuli or in the internal scales on which stimuli are being evaluated. For 
example, when the ‘method of absolute judgement’ is employed, the subject ‘must make 
his judgement by referring to some "internal" measuring stick which, presumably, he has 
built up by past experiences’ (Underwood, 1966, p. 49). 

The effect of anchorages upon the reference scales people develop and use in making 
estimations and judgements, has been demonstrated in laboratory measures of both 
psychophysical judgements and judgements on social issues (Tresselt, 1947, 1948; 
Volkmann, 1951; Fehrer, 1952; MacNeil & Sherif, 1976). Further, reference scales and their
related anchorages are used in everyday life and in placing, for example, individuals as well
as social units along attitude spectra of various kinds. There is, however, little or nothing
in the social research literature specifically dealing with the evidence of the effects of those
phenomena which appear to be the result of the psychological processes involved in human 
judgement-making and comparison. 

Many sources of reference scales used by the individual are provided by his culture. 
Sociogenic reference scales include categories of seasons, colours and other arrangements 
provided for classifying the physical, as well as the social, world in which people live (Sherif 
& Sherif, 1969; Farb, 1974). Because making a judgement involves, by definition, making a 
comparison, these culturally provided scales provide the means for placing both concrete 
and abstract items into a meaningful scheme of things. 

In our counting and measuring systems there are four sources of anchorages. The first 
two of these sources are combined and defined for the purposes of this study as the 
‘foot-dozen’ system of anchorages. The third source is closely related to the above and 
labelled the ‘even-number’ anchorages. The fourth source is the decimal system and its
relevant anchorages are so designated. The anchorages derived from counting and 
measuring by the foot-dozen system are the multiples and commonly used fractions of 12.
Examples of such anchorages are 4 items or 4 inches (a common fraction of a dozen or a
foot), 6 items or 6 inches (one-half a dozen or foot), 12 items or 12 inches (one dozen or
one foot), 18 inches (one-and-a-half dozen or feet), etc. The even-number anchorages
apparently derive from the common practice of counting by twos because of the
awkwardness of dividing and multiplying in the course of everyday use of the foot-dozen
system. The even-number anchorages are 2, 4, 6, 8,...14, 16,...22, etc. The
decimal-system anchorages are applicable to both counting and measuring, but the system's
application in our culture is still limited, principally, to monetary purposes. Despite this
limitation in everyday usage there appears to be some generalization of the decimal
anchorages to both counting and measuring reference scales. The decimal anchorages are 
10, its multiples, and commonly used fractions. The latter for the most part consists of five 
and its multiples since the decimal system does not break down conveniently into whole 
integers except for units of one and five. 

Some numbers, of course, are doubly anchored: 20 is a part of the decimal system as 
well as being an even number. Both 30 and 60 are triply anchored in that they are: even 
numbers, multiples of foot-dozens or half foot-dozens, and multiples of 10. 
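
One way to make these definitions concrete is a small classifier that reports which anchorage systems a whole-number estimate belongs to. The sketch below is an illustration only: the function name is ours, and the foot-dozen rule (multiples of a half foot-dozen, plus 4 as the one smaller fraction named in the text) is an assumption, since the full set of "commonly used fractions" is not listed. The coding scheme actually applied to the data may have differed.

```python
def anchor_systems(n: int) -> set:
    """Return the set of anchorage systems a positive integer estimate falls under."""
    systems = set()
    if n <= 0:
        return systems
    # Foot-dozen: multiples of a half foot-dozen (6, 12, 18, ...), plus 4,
    # which the text names as a common fraction of a dozen or a foot.
    # Assumption: other fractions such as 3 are not counted.
    if n % 6 == 0 or n == 4:
        systems.add("foot-dozen")
    # Even-number: anything reached by counting in twos.
    if n % 2 == 0:
        systems.add("even-number")
    # Decimal: 10 and its multiples, plus five and its multiples.
    if n % 5 == 0:
        systems.add("decimal")
    return systems


for value in (7, 14, 15, 20, 30, 60):
    found = anchor_systems(value)
    print(value, len(found), sorted(found))
```

Under these rules 20 comes out doubly anchored and 30 and 60 triply anchored, as in the text, while 7 is unanchored.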

While replicating Sherif's (1935) study of social norm formation in the laboratory, using 
a recently developed apparatus (the ‘hex’ situation) for generating stimuli analogous to 
autokinetic stimuli (Pace & MacNeil, 1974), it was found that anchorages affected 
distributions of the judgements. (The task in the hex situation is to judge the distance 
between two lights presented tachistoscopically [0.5 s] at the same time.)

A casual examination of the data produced by single subjects in the hex situation under 
‘alone’ conditions, i.e. without anyone else in the situation, indicated that there were
actually more judgements made of 14 inches and 16 inches than of 15 inches, which was, 
on every trial, the physical distance between the two lights. This evidence led to a 
consideration of the effect of anchorages in general and, specifically, on how anchorages 
affected estimations in non-laboratory situations. 

A reasonably accessible source of data which appeared suitable for comparison with 
these laboratory data was found in the ‘hunters’ survey of game killed’ conducted by the
Oklahoma Department of Wildlife Conservation in 1971. The data examined were drawn
from reports of the number of quail reported killed by each licensed hunter within the
State. The bag limits on quail allowed per day and season are reasonably large in
Oklahoma. Also, the annual hunting season of approximately 8 weeks was deemed of such
length as to ensure that most hunters would make estimates of their quail kills rather than 
resorting to accurately kept records. 


Method 


A comparison of the two judgement-estimation situations, laboratory and real-life, is reported here.
The data were derived from subjects making estimates of the distance between two points of light in 
the laboratory and from hunters' reports of quail killed. 


Hex situation 


The laboratory data were the verbally given estimates of 18 male college students making judgements 
of the distance between two 1 mm diameter points of light, presented at the same time for 0.5 s, in a
totally dark room (the hex situation). The angle of the axis of the pairs of lights was randomly varied
from presentation to presentation. The actual distance between the lights, in every case, was 15 in.
Time between presentations of the stimuli was 60 s. Subjects, seated behind a table and facing the 
stimulus source, did not know their distance from the stimuli (16 ft). Subjects never saw the 
experimental room in the light and were therefore unaware of its dimensions. The stimuli were 
presented in a plane vertical to the subject's line of sight. The centre of a 30 in circle which
encompassed all the pairs of stimulus lights and their varied axes was 5 ft above floor level.
Each subject remained for 2 min in the dark adaptation room under dim, red illumination while
the experimenter presented general instructions as follows: 

This is a judgment situation in which we are trying to determine how well people judge distance 
at night. Your task will be to judge the distance between two points of light. We will enter this 
other room [experimenter indicates the experiment room door] and then stop just inside the 
door. You will notice a curtain drawn across the end for the entrance booth. After shutting the 
door I will open this curtain and lead you to your seat. The reason this is done is because the
room is completely dark. I will then walk to my machine and turn it on. Then I will give you a
signal (I will say ‘ready’) and show you two points of light. In less than a second the lights will 
disappear. Then tell me the distance between the two points of light. Try to make your estimates 
as accurate as possible. 

Following dark-adaptation the experimenter guided the subject into the hex laboratory and placed 
him in a chair behind a table so that he directly faced the apparatus. After seating the subject behind 
a table in the laboratory the experimenter moved to a position directly in front of the subject and
said the following: 

I will give you a signal by saying ‘ready’ and show you two points of light. A second or so later 
the lights will disappear. As soon as the lights disappear tell me the distance between the points 
of light. Try to make your estimates just as accurate as possible.
The experimenter then made his way in the totally dark room to the stimulus apparatus. On his
way to the apparatus the experimenter said: 
There will be one trial run so that you can get used to the machine. Tell me when you see two 
lights. After you have seen two lights I will say ‘ready’, then give me judgements on the next set 
of lights. 

Each of the 18 subjects made 96 judgements, aloud, which were recorded by the experimenter.


Hunters’ survey 
The field data were hunters' reports of quail killed made to the State of Oklahoma Wildlife 
Commission for the 1970 hunting season. Twenty-eight counties were selected randomly from the 77 
counties in Oklahoma and provided reports of 1241 hunters. Only reports of from nine kills to 60 
kills were utilized in the study. The limits decided upon reflected the assumption that reports of below 
nine were more likely to be counts than estimates, and the sparsity of reports of over 60 killed. This 
restriction reduced to 581 the sample used in the analysis. 
Each licensed hunter was mailed a questionnaire which included the following instructions: 
A questionnaire is enclosed to report the results of your hunting during the past season. Please 
complete the questionnaire and mail immediately. The card does not require postage. Please fill
out the card completely. Do not report the game harvest of any other sportsman with whom you
may have hunted.


Results 


The data generated in the hex laboratory were in the form of repeated measures, with each 
of the 18 subjects giving 96 judgements. On the other hand, the reports of the number of 
quail killed were statistically independent, i.e. only one such judgement was reported by 
each hunter. Therefore, statistical analysis was conducted separately for the two types of 
data. 

For the purpose of analysing both the laboratory and the field data, numbers were 
classified as being either non-anchored or anchored. Anchored numbers were broken down 
into either single-anchor or multi-anchor numbers. The multi-anchor numbers were 
designated as either doubly or triply anchored. Non-anchored numbers include the 
numbers 1, 3, 7, 11, etc. Single-anchor numbers are numbers such as 4, 5 and 8. 
Multi-anchor numbers include both double (6, 10, 12, etc.) and triple (30 and 60) anchors. 
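The classification can be sketched in code. The rule below is an inference from the examples given in the text, not a rule stated by the authors: a number is assumed to gain one anchor for being even, one for being a multiple of six (half-dozens and dozens), and one for being a multiple of five. The function names are hypothetical. Under that assumption the sketch reproduces every example listed above and also the counts of 13 non-anchored, 13 single-anchor and 8 multi-anchor numbers reported for the range 3 in to 36 in.

```python
def anchor_count(n: int) -> int:
    """Count the anchor systems n belongs to (assumed rule, inferred from the text)."""
    is_even = n % 2 == 0
    is_half_dozen = n % 6 == 0   # multiples of 6 cover half-dozens and dozens
    is_fives = n % 5 == 0        # five and its multiples (decimal system)
    return sum([is_even, is_half_dozen, is_fives])


def classify(n: int) -> str:
    """Map the anchor count onto the category labels used in the paper."""
    labels = {0: "non-anchored", 1: "single", 2: "double", 3: "triple"}
    return labels[anchor_count(n)]


# Tally the categories over the range of judgements given in the hex situation.
tally = {"non-anchored": 0, "single": 0, "double": 0, "triple": 0}
for value in range(3, 37):
    tally[classify(value)] += 1
```

Because every multiple of six is also even, no number can be anchored by the half-dozen system alone, which is why 6, 12 and 18 fall in the double category.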


Hex situation 

The data derived from the 18 subjects' judging the distance between lights were analysed 
with the Wilcoxon matched pairs signed rank test (Siegel, 1956). For each subject, the 
number of his single-anchor judgements was matched against the frequency of his 
non-anchor judgements. Of the 34 numbers (from 3 in to 36 in) given by all subjects as
estimates of the distance between lights, 13 of those numbers were in the single-anchor
classification. There were also 13 non-anchored numbers within the same range. Analysis of
the data indicates a highly significant difference between the use of single-anchor and
non-anchored numbers (T = 14.5, z = 3.09, P = 0.0010, one-tailed). Only eight of the
numbers between 3 in and 36 in inclusive are in the multi-anchor classification, that is, are
either double or triple anchored. Nevertheless, analysis showed a highly significant
preference for multi-anchored rather than non-anchored numbers (T = 15, z = 3.06,
P = 0.0011, one-tailed).
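As a rough check on the first of these figures, the large-sample normal approximation for the Wilcoxon statistic can be sketched as follows. The sketch assumes that all 18 subjects contributed a non-zero difference (so N = 18) and applies no correction for ties or continuity; the paper does not state either point, and the function name is hypothetical.

```python
import math


def wilcoxon_normal_approx(t_stat: float, n_pairs: int) -> tuple[float, float]:
    """Return (z, one-tailed P) for a Wilcoxon signed-rank statistic T.

    Uses the standard large-sample approximation:
    mean = N(N + 1) / 4, variance = N(N + 1)(2N + 1) / 24.
    """
    mean_t = n_pairs * (n_pairs + 1) / 4
    sd_t = math.sqrt(n_pairs * (n_pairs + 1) * (2 * n_pairs + 1) / 24)
    z = (t_stat - mean_t) / sd_t
    # Upper-tail area of the standard normal beyond |z|.
    p_one_tailed = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return z, p_one_tailed


# Single-anchor versus non-anchored comparison: T = 14.5, assumed N = 18.
z, p = wilcoxon_normal_approx(14.5, 18)
```

With these inputs the magnitude of z comes out at about 3.09 and the one-tailed probability at about 0.0010, in line with the values reported for the single-anchor comparison. The same formula with T = 15 gives roughly 3.07, close to but not identical with the printed 3.06.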


Hunters’ survey 


The quail-kill data were analysed in terms of non-anchored versus anchored categories. A 
comparison was also made among (1) non-anchored, (2) single- or double-anchored, and (3)
triple-anchored categories. Chi-square analyses (Siegel, 1956) were computed, with
‘expected frequencies’ based on the assumption of equal probabilities of occurrence of each
number. The analysis (two-way) indicated a highly significant preference for the use of
anchored rather than non-anchored numbers (χ² = 221.3, P < 0.001). Likewise, a
three-way analysis (non-anchored versus single- or double-anchored versus triple-anchored
numbers) showed a highly significant difference (χ² = 290.7, P < 0.001), with
triple-anchored numbers being used with proportionately the greatest frequency (3.2 times
the ‘expected’ frequency). 
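The way such ‘expected frequencies’ arise can be sketched as follows. The sketch assumes that equal probability means each of the 52 whole numbers from 9 to 60 is equally likely to be reported by any of the 581 hunters; the function names are hypothetical, and no observed frequencies are supplied because the paper does not list them.

```python
def expected_count(total_reports: int, values_in_category: int, values_in_range: int) -> float:
    """Expected frequency for a category if every value in the range is equally likely."""
    return total_reports * values_in_category / values_in_range


def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


n_values = 60 - 9 + 1                                # whole numbers from 9 to 60
expected_triple = expected_count(581, 2, n_values)   # the two triple anchors, 30 and 60
implied_triple_reports = 3.2 * expected_triple       # "3.2 times the expected frequency"
```

On these assumptions the two triple-anchored numbers would be expected to attract about 22 reports between them, so the stated ratio of 3.2 implies roughly 71 or 72 reports of either 30 or 60 quail. The chi-square function would be applied to the full set of observed category counts, which would need to be taken from the original survey data.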


Summary and discussion 


The frequency distributions from both the laboratory and the field studies contained 
significantly more ‘anchored’ than ‘non-anchored’ judgements. Analysis of the numbers
given in the hex situation also showed a stronger preference for ‘multi-anchored’ rather
than ‘non-anchored’ numbers. Also in the quail-kill data the two ‘triple-anchored’
numbers were given as estimates with proportionately the greatest frequency. 

No meaningful test of the strength of triple anchors could be made on estimates given in 
the hex situation. Within the range of judgements given, only one number, 30, was
categorized as a triple anchor. Judgements in the laboratory situation ranged only up to 
36, with only 23 judgements (out of 1728) given from 30 in to 36 in. With an actual 
distance between the lights set always at 15 in, the possibility of testing for the strength of 
triple anchors in this situation was eliminated. It might be noted, however, that 12 of the 
23 judgements within this range were of 30 in. 

One reservation to the comparison is that each subject in the hex situation made a 
number of judgements regarding the stimuli presented; whereas the hunters reported only 
one estimate regarding the number of quail they had killed. As a result, there can be no 
direct statistical comparisons between the results of the two studies. Nonetheless, it is 
apparent that in both the laboratory and in the field conditions cultural anchorages were 
operating to inhibit the production of unbiased estimates. 
Conservation without conversation? An alternative, non-verbal paradigm for
assessing conservation of liquid quantity 


Kevin Wheldall and Barbara Poborca 





A non-verbal paradigm for assessing conservation based on an operant discrimination learning 
procedure is described. Children were trained to press a button when shown two jars containing
equal amounts of water and to refrain from pressing when the amounts were unequal. In this way
children were taught to respond to a non-verbal request by wordlessly signalling their evaluation of
the relationship between two quantities. When criterion was reached, one of two quantities previously
judged equal was poured into a different shaped jar and an evaluatory response to this transformed
stimulus was non-verbally requested. Initial results suggest that young children who could not
conserve within the traditional verbal procedure were more likely to demonstrate conservation within
the non-verbal paradigm and that traditional Piagetian tasks are verbally biased. 





‘As regards concrete operations, Piaget considers language ... not even a necessary
condition for their constitution’ (Sinclair-de-Zwart, 1969). It is a curious anomaly,
therefore, that language remains a necessary condition for assessing the attainment of the 
concept of conservation which inaugurates the stage of concrete operations at around 7 
years. Piaget’s implicit assumption about receptive language development constitutes a 
potential flaw in his assessment procedure; he assumes that by the age (or stage) when the 
child has acquired the cognitive concept, his level of receptive language development will be 
sufficient to enable him to understand the complex questioning involved in the 
conservation assessment procedure. Yet as early as 1960, Roger Brown referred to 
problems engendered by ‘language lag’ as ‘one reason for giving priority to the non-verbal 
criteria’ since ‘a child might have the ability to perform a given operation without having 
the verbal knowledge to comprehend instructions’ (quoted by Braine, 1962). 

The work of Donaldson and her colleagues (1968, 1970, 1974) and others (Maratsos, 
1973; Webb et al., 1974; Brush, 1976) on semantic development demonstrates how 
inconsistent and inappropriate are young children’s interpretations of ‘same’, ‘more’, ‘less’ 
and ‘different’, which are crucial in the quasi-standardized questions which have gradually 
replaced Piaget’s ‘clinical’ interrogations. Children’s differing reactions to different forms of 
questions in the same simple situations have prompted experimenters to investigate the 
effect of changing the wording (Nair cited in Bruner, 1966; Hamel, 1974; Russell, 1975).
Rothenberg (1969) points out that ‘among the subjects who fail to conserve, it has not 
been possible to know whether this failure was due to the inability to understand the 
language of the question, the concept of conservation, or both’. Various methods have
been employed in an attempt to avoid this problem. Rothenberg (1969), P. Miller (1973)
and others assessed their subjects’ language comprehension prior to conservation testing;
some omitted those children who could not comprehend the instructions, whereas others
(Braine & Shanks, 1965; Gruen, 1965; Rothenberg & Orost, 1969) taught them the 
appropriate words. Neither technique is satisfactory; one merely avoids the problem by 
eliminating ‘non-comprehenders’ whereas the other trains the words used within a different
context. Berko & Brown (1960) have suggested that a child's non-conserving judgements 
may be directly due to the limitations of his vocabulary since he may learn the words 
‘more’, ‘less’ and ‘the same’ in simple situations but fail to make appropriate 
generalizations to more complex usage. 

The importance of context generally constitutes a further, related, criticism of Piaget’s 
traditional paradigm. Rose (1973) and Rose & Blank (1974) have demonstrated how
children’s efforts to agree with the experimenter lead them (a) to an overindulgence in 
affirmative ‘yes, yes’ replies or (b) to change their response to reiterated questions and 
altered circumstances, or (c) to focus their attention on the manipulated properties (see also 
Piaget, 1968; Gelman, 1969; P. Miller, 1973). In McGarrigle & Donaldson’s (1976) 
ingenious experiment many children failed to conserve when the experimenter transformed 
a row of counters, yet the same children conserved when a ‘naughty’ teddy bear was 
manipulated to deform the row of counters ‘by accident’. Greenfield (1966) also found that 
rural African children conserved more frequently when they made the transformations 
themselves since they were then less inclined to attribute ‘magical’ effects to the action. 
Hunt’s (1975) work, following Orne’s (1962) studies on ‘demand characteristics’, also 
suggests that adult expectancy is yet another extralogical cue which influences young 
children’s conservation judgement. 

In the traditional verbal paradigm the child’s performance depends not only on his 
comprehension of verbal instructions but also on his ability to verbalize his justifications 
for his judgement. This necessity for verbal justifications of judgements is a current focus of 
contention in Piagetian research and theory. It seems curious to demand verbal explanation 
as confirmation of a child’s ability to conserve since in many other situations (from riding a 
bicycle to performance on intelligence tests), we are content with performance alone. 
Moreover, justification may not necessarily identify true conservers. A young child’s 
incorrect justification or failure to provide a justification may result from lack of 
confidence, from uncertainty induced by this questioning of his evaluation, or from opting 
for more easily described, but less accurate, explanations as well as from inadequate verbal 
facility. On the other hand, we have evidence which suggests that some young children are 
able to produce acceptable justifications whilst manifestly failing the tasks (Poborca, 
forthcoming). 

Many experimenters, however, accept judgement(s) alone as evidence of conserving, thus 
eliminating the need for verbal explanations. In this they agree with Gruen’s (1966) and 
Brainerd’s (1973) arguments that the justifications included in Piaget’s definition of 
conservation constitute a sufficient but not necessary criterion, leading to Type II errors (or 
false negatives), i.e. denying the true presence of the concepts, whereas if care is taken in 
the design of the task a correct judgement alone can provide a sufficient and necessary 
criterion. In order to avoid Type I errors (or false positives), i.e. falsely accepting the
presence of the concept, a test of conservation of substance should include: (a) several 
different tasks, comprising both deformation and division; (b) a demonstration of the 
transformation; and (c) the presence of both the final and initial configurations for 
comparison (Hobbs, 1975; Miller, 1976). 

Braine (1959, 1962) has long advocated the desirability of non-verbal methods for 
evaluating Piagetian concepts, arguing that ‘the frequent use of verbal stimuli introduced 
into the experimental procedure factors that are difficult to evaluate’ (1962) and that 
‘Piaget fails to eliminate important variables which are not involved in the definition of the 
processes he sets out to investigate’ (1959).

In similar vein, Siegel (1977) argues: ‘It would seem to be an obvious paradox to 
postulate the independence of language and thought, and then to rely on language to infer 
the existence of certain kinds of thought. If in fact thought is not necessarily dependent on 
language, then it would seem obvious that non-verbal methods would serve as the only 
appropriate test of Piagetian theory’ (author’s original emphasis). 

In 1976 Miller published a review of studies on the non-verbal assessment of Piagetian 
concepts, most of which were concerned with conservation, since it is considered a central 
principle which appears to be particularly dependent on language for its assessment. He 
points out that: ‘None of the studies. . .is literally non-verbal. All, however are intended to
be less verbal than the classic Piagetian procedures, especially with respect to the critical 
elements of the assessment procedure, that is, the form in which the criterial question is 
posed and the manner in which the child is required to respond.’ 

Studies aiming for non-verbal assessment have made use of three basic approaches 
(reviewed by Miller, 1976):

(a) The child’s response of surprise, which is taken to indicate the violation of his 
expectancies. If he shows surprise at an apparent example of non-conservation we can infer 
that he expected conservation and therefore has some understanding of it (Mermelstein & 
Schulman, 1967; Mermelstein & Meyer, 1969; Achenbach, 1973; all cited in Miller, 1976).

(b) The motivated choice procedure, which requires the child to be taught to ‘find the 
candy’ under an example of the desired attribute (Braine, 1959, 1962; Mehler & Bever, 
1967; Siegel, 1971, 1972). 

(c) The instrumental choice method, which differs from the previous procedure in that 
the child is trained to ‘find the candy’ by performing an independent action or by 
matching to sample (for example the same colour). Siegel (1971) used this method to test 
conservation by conditioning the child to find a reward under the card with the same 
configuration of dots as the sample, and then testing him with cards on which the correct 
number of dots was closer together than on the original sample. Since there is no initial 
configuration or transformation, however, this task can only measure ‘pseudoconservation’
(Piaget, 1968). Sawada & Nelson (1967) and Schwartz & Scholnick (1970) also used 
variations of the instrumental choice method. 

None of the studies appears to have demonstrated the superior performance we would 
expect from non-verbal methods given that they eliminate some of the verbal ambiguity 
and confusing contextual factors inherent in traditional conservation assessment. One of 
two conclusions may be drawn from this. One might argue that they provide evidence that 
non-verbal methods fail to yield superior performance. Alternatively, one can argue, as we 
do, that these studies have merely failed to demonstrate this phenomenon due to inherent 
flaws in their design. For example in the study by Harder (1971, unpublished dissertation, 
reported in Miller, 1976) on conservation of length, first-grade children were trained 
through differential reinforcement to press one button when presented with parallel sticks 
of equal length but to press another button when shown ‘sticks’ of unequal lengths. When 
criterion was reached, a series of so-called ‘conservation’ trials was introduced among the 
training trials, i.e. ‘transformed’ stimuli were presented. Subjects’ performances were
almost identical on these trials and on a standard verbal test. Harder, however, only 
measured ‘pseudoconservation’ since he presented two equal ‘sticks’ in a staggered form, 
i.e. without first presenting them as a parallel equal pair or overtly disaligning them in view 
of the children. He also continued to give reinforcement on a random basis during the 
actual conservation trials, a serious flaw. Our own approach is also a variant of the 
instrumental choice method and is similar to Harder's study in some respects but differs in 
procedure and results. It also investigates continuous quantity (liquid) instead of length. 

We have attempted to devise a non-verbal paradigm for testing conservation in the sense 
of Braine's use of the term, non-verbal: ‘Such a method is non-verbal because, while some
words may pass between experimenter and subject, there is no systematic association
between any word and the cue stimuli to which response is rewarded, i.e. no verbal
stimulus is used which might evoke the concept to be studied’ (1962). Whilst not
eliminating verbal interaction with the child, the method attempts to elicit evaluatory 
responses to non-verbally presented conservation problems. The technique relies principally 
on an operant discrimination learning paradigm with similarities to a go/no go procedure. 
It must be emphasized that the method does not (and in fact could not) teach the child how
to conserve or even how to produce apparently conserving responses. Rather than training the 
child to conserve, the method trains the child in a non-verbal response mode by which he 
can communicate his existing skill in conservation which, it is hypothesized, may already 
be within the child's repertoire but which cannot yet be demonstrated under verbal 
conditions. 

In fulfilling our aim of devising a relatively non-verbal paradigm for testing conservation 
we were concerned with two major hypotheses: 

(a) That significantly more young children would be able to demonstrate full 
conservation under the non-verbal paradigm than in the traditional, Piagetian, verbal tests 
of liquid conservation; 

(b) That most children who demonstrated full conservation on a verbal test of liquid 
conservation would also conserve fully under the equivalent non-verbal paradigm. 

The other main difference between this experiment and those previously reviewed is that 
we attempted to demonstrate superior performance using a non-verbal procedure to assess 
the conservation of liquid quantity. This is a particularly difficult concept to convey 
non-verbally as Piaget & Inhelder (1969) affirm, ‘since the subjects must be made to 
understand that the questions have to do with the contents and not with the containers 
themselves’. Consequently liquid quantity allows a more rigorous test of the paradigm 
since it is especially dependent on verbal distinctions.


General method 
Apparatus 


The apparatus consisted of a response unit comprising a red response button mounted alongside a 
green cueing lamp on a small raised platform situated directly in front of a larger platform upon 
which stimulus arrays of jars of coloured water were presented to the children. Each child sat directly 
facing the stimulus array platform, with his preferred hand on or near the response unit. The response 
unit was wired to appropriate logic circuitry connected to a reinforcer dispenser. 

In brief, the experimenter presented the stimulus array and switched on the green light to signal the 
onset of a trial. The light stayed on for 3 seconds. The child's task was to press the response button, 
or to refrain from pressing the button, during this interval. When the two presented identical jars 
contained equal quantities of liquid, pressing the button switched off the light and discharged a 
Smartie® from the dispenser. Failure to press the button under these conditions was not rewarded by
a Smartie®. When the two identical jars contained unequal quantities of liquid, refraining from
pressing the button while the light was on was rewarded by a Smartie®. If the child pressed the
button under this contingency, no Smartie® was delivered but the light immediately went out, ending
the trial.
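The four contingencies just described amount to a small truth table, which might be sketched as follows. This is an illustration only, not a description of the authors' logic circuitry; the names `Outcome` and `trial_outcome` are hypothetical.

```python
from typing import NamedTuple

LIGHT_ON_SECONDS = 3  # response window signalled by the green cueing lamp


class Outcome(NamedTuple):
    smartie_delivered: bool  # was the response reinforced?
    light_cut_short: bool    # did the light go out before the 3 s elapsed?


def trial_outcome(quantities_equal: bool, pressed: bool) -> Outcome:
    """Apply the go/no-go contingency for one trial.

    Equal quantities: pressing is correct and is rewarded.
    Unequal quantities: refraining is correct and is rewarded.
    A press always switches the light off, whatever the stimulus.
    """
    correct = pressed == quantities_equal
    return Outcome(smartie_delivered=correct, light_cut_short=pressed)
```

For example, `trial_outcome(quantities_equal=False, pressed=True)` yields no reward and an early end to the trial, matching the last case in the description.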

During the course of the training five matched sets of vessels were employed, varying in size and 
shape and including hydrometer jars, tumblers and bottles. This was to facilitate the generalization of 
the concept, and to accentuate the fact that it was the relative quantities of liquid in the vessels and 
not the vessels themselves which constituted the essential discrimination. The various pairs of 
containers and opaque jugs of pre-measured liquid were manipulated below table level on the 
experimenter’s left, out of sight of the child.


Procedure 


Upon entering the experimental room the child was seated on the experimenter’s right, facing the 
display platform. The experimenter established rapport with the child and directed his attention to
the relevant equipment, demonstrating the illumination of the lamp, pressing or not pressing the 
button and the function of the reinforcement dispenser. 


Details of response-training 


Two identical glass jars containing small (one measure) quantities of coloured water were placed on 
the display platform, the appropriate contingency was set, and the child was told: ‘When the water’s 
like this you press the red button to get a Smartie®. Try it.’ The cueing light was then switched on,
the child pressed the button, the light went out and a Smartie® was immediately discharged from the
dispenser. 

The experimenter then poured a further measure of liquid into the left-hand jar, changing it from a 
small to a large quantity and so presenting an unequal large/small pair while also demonstrating that 
addition results in more. She then said: ‘But when the water's like this you sit back and leave the red
button alone. Try it.’ When she had set the appropriate contingency setting she switched on the green 
light, the subject refrained from pressing the response button and was rewarded by a Smartie® 
discharged from the dispenser. Small equal pairs and large/small unequal pairs were then presented 
without any addition or subtraction of liquid until the subject consistently responded correctly, 
regardless of the experimenter’s operations in forming them. Jars were reversed in position and/or 
replaced with identical jars containing either the same or the alternative amount, in order to present
series of the same as well as different stimuli. When the child has only two stimuli and two responses, 
he learns that the same result can be achieved by many different manipulations and that the 
configuration of the water is his only guide. Once these two responses were learnt, two identical glass 
jars containing equal, large quantities (two measures) of liquid were presented and the child was 
instructed to ‘press the red button when the water's like this’. The appropriate reinforcement 
contingency was then set and the light cue given. Once the child had responded and had been 
rewarded, one measure of water was poured out of the left-hand jar. This was to demonstrate that
subtraction results in less, whilst at the same time reducing one quantity from large to small and 
hence the large, equal pair to an unequal, small/large pair. The contingency setting was then changed, 
the child followed the instructions to ‘leave the red button alone when the water is like this’, and was
rewarded with a Smartie® from the dispenser. Large, equal pairs and small/large, unequal pairs were 
presented in the same way as the preceding stimulus pairs without any subtractions or additions of 
liquid but with many rearrangements and exchanges of either of the paired identical jars, until the 
child's performance remained consistent regardless of any (irrelevant) actions performed by the 
experimenter. The first two stimulus pairs were then gradually reintroduced until all four pairs (two 
equal and two unequal) could be freely interchanged without affecting performance. 

The training of some subjects required more than one session, in which case training sessions were 
provided on consecutive days until criterion was reached. Criterion consisted of ‘6 Smarties® in a 
row’, i.e. six correct consecutive responses to stimuli which included the four basic stimulus
configurations. Since at no point during the response-training were quantities of liquid transformed,
subjects were not taught to conserve.
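One way of reading the criterion is sketched below. It assumes that the six consecutive correct responses must between them cover all four basic stimulus configurations; the text can also be read more loosely, and the configuration labels and function name are hypothetical.

```python
# The four basic stimulus pairs used in response-training (labels are assumed).
CONFIGURATIONS = {"small_equal", "large_small", "large_equal", "small_large"}


def reached_criterion(trials, run_length=6):
    """Return True once any run of `run_length` consecutive trials is all
    correct and includes every basic configuration at least once.

    `trials` is a sequence of (configuration, was_correct) pairs in the
    order in which they were presented.
    """
    for end in range(run_length, len(trials) + 1):
        window = trials[end - run_length:end]
        all_correct = all(was_correct for _, was_correct in window)
        configurations_seen = {configuration for configuration, _ in window}
        if all_correct and CONFIGURATIONS <= configurations_seen:
            return True
    return False
```

A child who produced six correct responses in a row but saw only three of the four pair types in that run would not count as having reached criterion under this reading.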


The non-verbal test of conservation of liquid quantity


The experimenter then withdrew reinforcement/feedback by giving the subject 10 Smarties® and 
suggesting that he should work the machine 10 more times with the Smartie® dispenser disconnected, 
since the experimenter would be 'too busy pouring the water into the jars'. The dispenser was then 
disconnected in full view of the child. Testing began with two trials, ‘equal’ followed by ‘unequal’, to 
check that the child was still performing appropriately. 

The non-verbal conservation test, consisting of three tasks and therefore six presentations, then 
followed as part of the same sequence of 10 pre-rewarded trials. In the first task, a novel pair of 
identical glass jars containing equal quantities of coloured water was presented and the subject 
responded when the cueing light was illuminated. Liquid from the left-hand jar was then poured into 
a tall, narrow glass in full view of the child. It was then presented with the remaining jar, the empty
jar being left in sight. The light was then again illuminated, signalling another trial and hence
demanding a further response. In the second task, another novel pair of identical jars containing 
equal quantities of water was presented. Following the child's correct response, i.e. pressing the 
button for equal, the liquid from the left-hand jar was poured into a wide, shallow dish for 
comparison with the untouched right-hand jar. The empty jar was again left in sight, and another 
response demanded as before. Finally in the third task a further novel pair of identical jars containing 
equal quantities of liquid was presented and judged equal. The water from the left-hand jar was then 
divided equally between three small jars standing on a small tray. These were presented with the 
untouched right-hand jar for comparison, with the empty jar remaining in view, and another response
again demanded.

Every child responded correctly to the two initial, transitional presentations following the 
disconnection of the dispenser, and to the three presentations of equal quantity preceding the three 
conservation trials, thus demonstrating that criterial performance was maintained. The order of the 
tasks remained the same for all subjects. Each subject’s conservation status was determined by his
performance over the three conservation trials. Only those subjects passing all three conservation 
tasks were considered as full conservers. 

An extra fourth task was also included, making up the total of 10 pre-rewarded trials, but was 
excluded from the main results. In this task, two quantities of water were judged equal and were then 
both poured simultaneously into two matched, shorter and wider, opaque containers so that children 
were unable to see the level of the water. The procedure was adapted from Frank’s screening 
experiments (in Bruner, 1966). In both the verbal and the non-verbal condition this caused more 
momentary consternation and thought than the other tasks, although every subject passed it 
non-verbally. It was failed in the verbal condition by only two children in Expt 1 and by six children 
in the pilot study. This task was omitted from subsequent experiments.


The verbal test of conservation of liquid quantity 


The child was taken to the experimental room and set the same three tasks in the same order, using
the traditional verbal interrogation procedure, the response unit and the reinforcement dispenser 
having been removed. Two identical glass jars with equal quantities of coloured water were placed in 
position as before, but in this condition the child was asked ‘Do these two glasses have the same 
amount of water in them, or does this one have more water in it, or does this one have more water in 
it?’ This question was chosen from the many and various test sentence structures reported in the 
literature as a typical form, i.e. neither the simplest nor the most complicated question form 
employed. Whereas every child had pressed the button for ‘equal’ in the non-verbal test, many 
children now answered the final part of the test question only, in line with Hood’s (1962) ‘recency 
hypothesis’, declaring that the last jar indicated contained more liquid. Quantities were then adjusted 
according to the child's instructions and the question repeated until he finally agreed that the 
amounts were equal. One quantity was then transformed as before and the test question repeated. 
This procedure was followed for all three conservation tasks, and again only those responding 
correctly on all three tasks were classed as full (verbal) conservers. 


Pilot study 
Subjects 


A pilot study was undertaken prior to the first main experiment using sixteen indigenous children 
(nine boys and seven girls) attending a socially mixed urban infants' school, with ages ranging from 
6:6 to 7:4 and a mean age of 7:0 (SD 3.39 months). Their mean standardized score on the English 
Picture Vocabulary Test (Full Range Edition; Brimer & Dunn, 1973) was 86 (SD 17.91) and their 
mean vocabulary age was 5:11 (SD 16.78 months). It was thought that as these children were less 
verbally proficient they would provide useful information regarding difficulties associated with the 
response-training procedures. We felt that if we could successfully train these children we would have 
learned to overcome any potential problems likely to be experienced by the children in the main 
experiment. 


Design and procedure 


A simple repeated measures design compared the conservation performance of subjects under the 
non-verbal paradigm with their subsequent performance, the following day, under the traditional 
verbal paradigm for testing conservation. Insofar as any practice effect or incidental learning or 
sudden emergence of conservation ability would favour the second testing, the experiment was 
weighted against the first experimental hypothesis. The same procedure was followed in the pilot 
study as previously described. 


Results 


Three children conserved under both conditions and six under neither; all children conserving on the 
verbal test also conserved on the non-verbal test but seven children conserved non-verbally while 
failing to conserve verbally. The binomial test for the significance of changes (Siegel, 1956) 
demonstrated that the number of children who ‘changed’ from non-conservers in the verbal condition 
to full conservers in the non-verbal condition was statistically significant (P < 0.01, one-tailed, see 
Table 1). 


Table 1. Distribution of children in the pilot study classified according to performance on 
both non-verbal and verbal conservation tests 


                       Verbal
                     NC      C
Non-verbal    C       7      3
              NC      6      0

C = Conserver; NC = Non-conserver
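
The test for the significance of changes used above can be illustrated with a short sketch. It assumes the usual form of the test, in which only the children who differ between the two conditions are counted and, under the null hypothesis, each is equally likely to differ in either direction. The function name `changes_test_one_tailed` is hypothetical, and the exact computation the authors followed is not given in the text.

```python
from math import comb


def changes_test_one_tailed(n_changed: int, n_reverse: int) -> float:
    """One-tailed exact binomial probability for paired changes.

    n_changed: children conserving non-verbally but not verbally.
    n_reverse: children conserving verbally but not non-verbally.
    Children with the same outcome in both conditions are ignored.
    """
    n = n_changed + n_reverse
    if n == 0:
        return 1.0  # no discordant children, so no evidence of change
    # P(X >= n_changed) for X ~ Binomial(n, 0.5)
    return sum(comb(n, k) for k in range(n_changed, n + 1)) / 2 ** n


# Pilot study: seven children changed in one direction, none in the other.
pilot_p = changes_test_one_tailed(7, 0)
print(f"Pilot study: P = {pilot_p:.4f}")
```

With seven changes in one direction and none in the other, the probability is 0.5 raised to the seventh power, which falls below the 0.01 level reported.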


Experiment 1 


The success of the pilot study in terms of both the effectiveness of the response-training and 
the demonstration of superior non-verbal performance prompted the following main 
experiment which replicated and extended these findings. 


Subjects 


The subjects were 36 indigenous children (18 boys and 18 girls) from the same infants’ school but 
with standardized scores between 84 and 114 points on the EPVT, i.e. within the average range. 
Chronological ages ranged between 6:4 and 7:3 with a mean of 6:9 (SD 3.52 months). The mean 
EPVT standardized score was 99.19 (SD 8.97 points), and the mean vocabulary age was also 6:9, but 
vocabulary ages ranged more widely, between 5:4 and 8:5 (SD 7.07 months). 


Figure 1. A comparison of the performance of the 36 children on the verbal and non-verbal tests 
(Expt 1). [Histogram not reproduced: number of children plotted against number of tasks passed, 
with separate bars for the verbal and non-verbal tests.] 


Design and procedure 


Design and procedure were the same as for the pilot study. 


Results 


Nine children were full conservers in both conditions and 13 children failed to conserve in 
either of the two conditions. Fourteen children who did not conserve in the verbal 
condition achieved conservation in the non-verbal condition, whereas there were no 
children who failed to conserve in the non-verbal condition and who then conserved in the 
verbal condition (see Table 2). (Details of the performance of the 36 children on the verbal 
and non-verbal testings are shown in the histogram, Fig. 1, which gives the numbers of 
children conserving on none, one, two or three out of the three conservation tasks.) 
Significantly more children ‘changed’ from non-conservers in the verbal condition to 
conservers in the non-verbal condition than vice versa (P < 0.001). In percentage terms, 64 
per cent of the sample demonstrated conservation in the non-verbal condition compared 
with only 25 per cent in the verbal condition. 


Table 2. Distribution of children in Expt 1 classified according to performance on both 
non-verbal and verbal conservation tests 


                       Verbal
                     NC      C
Non-verbal    C      14      9
              NC     13      0

C = Conserver; NC = Non-conserver
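
The percentages quoted for Expt 1 follow directly from the four cells of the paired classification. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, while the counts are those reported in the text.

```python
# Cell counts for Expt 1, keyed as (non-verbal status, verbal status).
# "C" = conserver, "NC" = non-conserver.
counts = {
    ("C", "C"): 9,     # conserved in both conditions
    ("C", "NC"): 14,   # conserved non-verbally only
    ("NC", "C"): 0,    # conserved verbally only
    ("NC", "NC"): 13,  # conserved in neither condition
}

total = sum(counts.values())
non_verbal_conservers = sum(n for (nv, _), n in counts.items() if nv == "C")
verbal_conservers = sum(n for (_, vb), n in counts.items() if vb == "C")

pct_non_verbal = 100 * non_verbal_conservers / total
pct_verbal = 100 * verbal_conservers / total

print(f"Non-verbal: {non_verbal_conservers}/{total} = {pct_non_verbal:.0f} per cent")
print(f"Verbal: {verbal_conservers}/{total} = {pct_verbal:.0f} per cent")
```

Only the two off-diagonal cells (14 and 0) enter the test for the significance of changes; the marginal totals give the 64 and 25 per cent figures.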

Performance on the verbal test was significantly correlated with verbal ability as 
measured by EPVT vocabulary age (r = 0.43, P < 0.01), whereas performance on the 
non-verbal test was not (r = 0.17, P > 0.05). In the verbal condition the mean vocabulary 
age of the conservers (7:4, SD 8.89) was significantly higher than the mean for the 
non-conservers (6:6, SD 8.97; t = 2.80, P < 0.01). In the non-verbal condition, the mean 
vocabulary age of the conservers (6:10, SD 10.89) was not significantly higher than the 
mean for the non-conservers (6:6, SD 7.40; t = 1.04, P > 0.05). 
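
As a rough check on these comparisons, a t value can be recomputed from the summary statistics alone. The sketch below makes several assumptions that the text does not confirm: that an independent-groups t test with pooled variance was used, that the SDs are in months, and that the group sizes are those implied by the Expt 1 counts (9 verbal conservers against 27 non-conservers; 23 non-verbal conservers against 13 non-conservers). Because the reported means are rounded to the nearest month, the recomputed values are only close to the published ones, not identical. The helper names are hypothetical.

```python
from math import sqrt


def to_months(age: str) -> int:
    """Convert a 'years:months' age such as '7:4' into months."""
    years, months = age.split(":")
    return int(years) * 12 + int(months)


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Independent-groups t statistic using a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# Verbal condition: conservers vs. non-conservers (vocabulary age in months).
t_verbal = pooled_t(to_months("7:4"), 8.89, 9, to_months("6:6"), 8.97, 27)

# Non-verbal condition: conservers vs. non-conservers.
t_non_verbal = pooled_t(to_months("6:10"), 10.89, 23, to_months("6:6"), 7.40, 13)

print(f"verbal: t = {t_verbal:.2f}; non-verbal: t = {t_non_verbal:.2f}")
```

The recomputed values reproduce the pattern reported: a clearly larger t for the verbal condition than for the non-verbal condition.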

Only 28 per cent of girls conserved verbally, 50 per cent conserving non-verbally. The 
difference was far more marked for the boys where only 22 per cent conserved verbally 
compared with 78 per cent non-verbally. The marginally superior performance of the girls 
on the verbal test is surprising in view of the higher mean vocabulary age of the boys (7:0, 
SD 9.80; girls 6:5, SD 9.03). The performance of the girls does not show evidence for a 
significant change, i.e. better performance in the non-verbal than in the verbal condition 
(P > 0.05), whereas the boys’ performance was shown to change significantly for the better 
(P < 0.001). 


Discussion 


In so far as the procedure which we have described is acceptable as a non-verbal equivalent 
to the traditional verbal paradigm employed in the assessment of conservation, we may 
claim that the results of the pilot study and the first experiment confirm our two major 
hypotheses. Significantly more young children were able to demonstrate full conservation 
under the non-verbal paradigm than in the traditional Piagetian, verbal tests of 
conservation of liquid quantity. Moreover every child who could demonstrate full 
conservation on the verbal test could also demonstrate full conservation using the 
non-verbal procedure. This second consideration is particularly important since Miller 
(1976) notes that ‘the non-verbal procedure may itself be subject to false negatives. A 
corollary suggestion is that it might be profitable for non-verbal studies to attempt to 
demonstrate that children who clearly have the concept by the usual verbal measures will in 
fact pass the non-verbal test.’ 

The criterion of conservation was similar for both the verbal and non-verbal procedures. 
Neither procedure required verbal justification but both required subjects to pass all three 
tasks in order to be counted as conservers. Our percentage of verbal conservers (thus 
defined) is in line with previously reported figures for verbal conservation at this age level; 
e.g. Beard (1963), testing task three (division) only, records 21 per cent conservers in the 
age range 5:10-6:9, and 41 per cent conservers in the age range 6:10-7:9. Thus our figures 
for non-verbal performance are being compared with acceptable, typical figures for verbal 
conservation at this age. 

In passing we also found some evidence for a possible sex effect favouring the 
performance of boys in the non-verbal condition. This ties in with the findings of some 
Piagetian researchers who have reported clear but usually non-significant differences in 
performance favouring males (Beard, 1963, 1964; Goldschmid, 1967; McNally, 1971; 
Brekke, 1972; Hobbs, 1973; all cited in Modgil, 1974, or Modgil & Modgil, 1976). Other 
researchers have recorded slight or no differences (Braine, 1959; Pratoomraj & Johnson, 
1966; Shantz & Sigal, 1967; Rothenberg & Courtney, 1968; Rothenberg & Orost, 1969; 
Siegel, 1971) while Brekke & Williams (1973, cited Modgil & Modgil, 1976) demonstrated 
significantly superior performance in girls. Finally, Fogelman (1970) reported that more 
boys than girls conserved when making the transformations themselves, while more girls 
conserved when merely watching the transformations performed. 

We were also able to demonstrate the dependence of verbal conservation test 
performance on the level of receptive language development, as estimated by EPVT 
vocabulary age. Verbal conservers were shown to have significantly higher mean 
vocabulary ages than verbal non-conservers, whereas in the non-verbal condition there was 
no significant difference in mean vocabulary age between conservers and non-conservers. 
This provides confirmatory evidence for our contention that the traditional assessment 
procedures are verbally ‘biased’. We will return to this later. 


Experiment 2 


Following this successful demonstration of the paradigm in which both of our hypotheses 
were confirmed, a further replication was attempted with younger children. An additional 
modification was the inclusion in the non-verbal and verbal tests of two additional tests of 
inequality which were alternated with the three tests of equality (see Design and procedure). 
This was included as an additional control to preclude the possibility of our results being 
interpreted as being due merely to the operation of simple response sets, following Beilin 
(1976). 


Subjects 


The subjects were 22 children (11 boys and 11 girls) of mixed ability, randomly selected from the 
second year infants in the same school as the previous study. Their ages ranged from 5:8 to 6:7 with 
a mean age of 6:1 (SD 3.37 months). The EPVT vocabulary age range was 4:9-9:1, with a mean of 
6:6 (SD 12.83 months). 


Design and procedure 


The same design was followed as in the previous experiments, but the apparatus, procedure and tests 
differed slightly. The apparatus was more compact with a more efficient reinforcement dispenser. 
There were also several small differences in the materials for this experiment compared with those 
outlined under general method. 

Instead of manipulating the liquid out of sight, the experimenter now poured pre-measured 
quantities from opaque jugs into matched, empty pairs of vessels on the display platform in full view 
of the child. Liquid was returned to the opaque jugs between presentations. The position and 
contents of each jug were constantly varied. This method allowed the experimenter to pour liquid 
openly and continually in order to emphasize ‘amount’ while at the same time ensuring that the child 
never witnessed any transformation of liquid quantity from one shape to another. 

As mentioned previously, both the verbal and non-verbal tests were expanded to five tasks by 
including two tasks requiring judgement of inequality, interspersed between the three equality 
conservation tasks. In the first inequality task two identical jars containing unequal quantities were 
presented. Following the child's response (which was invariably correct) the larger quantity was then 
openly poured into a wider jar so that the levels of liquid in the two jars were now the same height. A 
response to this transformed stimulus was then demanded. In the second inequality task, following 
the child's judgement of inequality, the smaller quantity was openly poured into a narrower jar so 
that the levels of liquid in the two jars were the same height. A response to this transformed stimulus 
was again demanded. 


Results 


Four children conserved equality in both conditions and 13 failed to conserve in either 
condition. All children who conserved verbally also conserved non-verbally but five 
children conserved non-verbally while failing to conserve verbally. The number of children 
who ‘changed’ from non-conservers in the verbal condition to conservers in the non-verbal 
condition was statistically significant (P < 0.05; see Table 3). Nine children (41 per cent) conserved 
non-verbally while only four children (18 per cent) conserved verbally. Five children 
conserved inequality in both verbal and non-verbal conditions and 14 failed to conserve in 
either. Three children conserved only in the non-verbal condition and no children 
conserved verbally but not non-verbally. This difference was not significant. Eight children 
(35.5 per cent) conserved non-verbally whereas only five children (23 per cent) conserved 
verbally. 


Table 3. Distribution of children in Expt 2 classified according to performance on both 
non-verbal and verbal conservation tests (equality tasks) 


                       Verbal
                     NC      C
Non-verbal    C       5      4
              NC     13      0

C = Conserver; NC = Non-conserver


Discussion 


The results of Expt 2 thus again confirm our two major hypotheses. The conservation rates 
were lower than those found in the previous study but the children were 8 months younger. 
The results from the inequality tasks confirm that our findings are not due to any simple 
response sets, since refraining from pressing is required to signal inequality. It must be 
emphasized that the results for equality and inequality are not directly comparable since 
only two inequality tasks were tested; this weaker criterion for inequality allows potentially 
more 'chance' responders to be counted as conservers. Bearing this in mind, we found only 
small differences in conservation rate for equality and inequality in both verbal and 
non-verbal conditions. The relationship between conservation of equality and conservation 
of inequality is a confused area as a recent paper by Hardeman & Peisach (1977) makes 
clear. Different investigators have obtained conflicting results but Hardeman & Peisach 
suggest that the relationship between conservation of equality and inequality will vary 
according to whether continuous or discontinuous quantities are used and also according 
to the age of the children assessed. We are not primarily concerned with conservation of 
inequality here, however. Our purpose was merely to use conservation of inequality tasks 
to demonstrate that our conservation of equality responses did not result from a simple 
response set. Consequently, and in view of the lack of direct comparability, we will not 
pursue discussion of inequality further here, nor will we detail results of conservation of 
inequality in the following Expt 3. Full details of these results from both experiments will 
be provided in Poborca (forthcoming). 


Experiment 3 


The third experiment was designed to extend the previous findings by (a) changing the 
order of the tests so that the verbal test was given both before and after response-training 
and the non-verbal test; (b) comparing an expanded form of response-training with our 
original training procedure; and (c) incorporating a control group given neither response 
training nor a non-verbal test. 


Subjects 


The subjects were the 66 available, indigenous, second year infants (33 boys and 33 girls) who had 
not conserved on the verbal test of liquid conservation. The infants’ school was in an urban, working 
class area and only 3 out of 79 children (3.8 per cent) conserved verbally. (This is well below the level 
generally reported; see, for example, Beard, 1963, who found 20.9 per cent conservers on one task for 
the identical age range.) The age range was 5:10-6:9 with a mean age of 6:4 (SD 3.45 months); their 
EPVT vocabulary ages ranged from 4:8 to 8:4, with a mean of 6:1 (SD 10.00 months). 


Design and procedure 


The children were randomly assigned to three groups of 22, which were then randomly allocated to 
three conditions. The first group 
was given the response-training with the materials described in the last experiment. The second group 
was given an expanded response-training which had previously been tried out on a small number of 
children in another school. Finally the third group was given neither training nor a non-verbal test. 
Since none of these children could conserve verbally on the initial verbal testing, this design allowed 
us to examine the effects of non-verbal training and testing on subsequent verbal test performance, as 
well as comparing the effects of two different forms of training. 

In the expanded response-training, the basic training (already outlined) was followed by 
presentations designed to emphasize the comparison of ‘amount’ (quantity) and to discourage 
comparisons on the basis of similarity of (a) the liquid, (b) the jars, and (c) (most importantly) the 
height of liquid level. Since the shape of the liquid could not be altered without teaching 
conservation, liquid and jars of different colours were compared while liquid level and liquid amount 
were contrasted by a method developed for deaf children by Furth (1966). 

These presentations consisted of: 

(a) two equal quantities of differently coloured water; 

(b) two equal quantities of liquid in transparent vessels of identical size and shape but of different 
colours; 

(c1) two equal amounts of water in identical jars, one of which was then placed on a wooden 
block. This raised its liquid level in relation to the other vessel but left its quantity and shape 
unchanged. These presentations were repeated with different-shaped jars and with different sizes and 
numbers of blocks, alternated with unequal presentations, until the child's responses were consistently 
correct. 

(c2) two unequal amounts of liquid in identical jars, the smaller of which was then placed on a 
wooden block so that the liquid surfaces in the two jars were on the same level (i.e. the same height 
above the table-top) while the shape and quantities of the liquids remained unchanged, i.e. unequal. 
These presentations were repeated with different-shaped jars and different sizes and numbers of 
blocks, alternated with equal presentations, until the child's responses were consistently correct. 
Verbal and non-verbal testing followed the same pattern as in the previous experiment. 


Results 


Seven out of 22 verbal non-conservers (31.8 per cent) who were given the basic response 
training conserved non-verbally, and six out of 22 verbal non-conservers (27.3 per cent) 
who were given the expanded response training conserved non-verbally (see Table 4). In 
the second verbal test of conservation of liquid quantity, none of the 22 control children, 
given neither response-training nor non-verbal testing, conserved verbally; similarly, none 
of the 22 children given the basic (original) response-training and non-verbal test of 
conservation of liquid quantity conserved verbally; however, three out of the 22 children 
(13.6 per cent) given the expanded response training and non-verbal test of conservation of 
liquid quantity now conserved verbally. These three children had all conserved 
non-verbally. Thus, three out of six (50 per cent) of the non-verbal conservers who had 
been given this expanded programme of response-training changed from verbal 
non-conservers to verbal conservers (see Table 5). In the ‘original training’ condition, 
significantly more children ‘changed’ from non-verbal conservers to verbal non-conservers 
(P < 0.01) whereas in the expanded response-training condition, the ‘change’ was not 
significant (P > 0.05). 


Table 4. Distribution of children in groups 1 and 2 of Expt 3 classified according to 
performance on both non-verbal and first verbal conservation tests. 


Group 1
                       Verbal
                     NC      C
Non-verbal    C       7      0
              NC     15      0

Group 2
                       Verbal
                     NC      C
Non-verbal    C       6      0
              NC     16      0

C = Conserver; NC = Non-conserver


Table 5. Distribution of children in groups 1 and 2 of Expt 3 classified according to 
performance on both non-verbal and second verbal conservation tests 


Group 1
                       Verbal
                     NC      C
Non-verbal    C       7      0
              NC     15      0

Group 2
                       Verbal
                     NC      C
Non-verbal    C       3      3
              NC     16      0

C = Conserver; NC = Non-conserver


Discussion 
Experiment 3 confirmed and extended our previous findings. Again we showed that a large 
proportion of children who could not conserve in the traditional, verbal procedure were 
able to demonstrate conservation using our non-verbal method, and that no child 
conserved verbally without also conserving non-verbally. We were also able to eliminate 
any possibility of our results being due to any simple order effects since verbal testing both 
preceded and followed non-verbal testing. Nor was our method shown to train 
conservation since none of the children in group 1 (trained by our original method) 
conserved on the verbal post-test. Similarly, the results from the control group (with no 
children conserving on the verbal post-test) showed that we could expect no 'spontaneous' 
improvement due to practice, to emergence of conservation during the 5-week period 
between the two verbal testings, or to children's changing responses on subsequent 
requestioning. 

The results concerning the expanded response-training are, however, less conclusive. 
Non-verbal performance was not shown to be significantly superior to subsequent verbal 
performance since half of the non-verbal conservers also conserved verbally on the verbal 
post-test. In view of the fact that we could expect no ‘spontaneous’ improvement, we are 
forced to consider the possibility that the expanded form of response-training either trained 
conservation or provided children who were near-achieving conservation with sufficient 
new evidence to become conservers. Training a child to ignore liquid level in judging 
quantities may be seen as providing training in a subskill essential to conservation. For this 
reason, we are led to suggest that our original response-training method is preferable at this 
stage, since it is not open to this criticism. 


An additional study on verbal conservation 


Whilst carrying out this programme of experiments, data were collected on a large number 
of children of varying ages who were given the traditional verbal conservation test (in the 
manner previously described) and who were also tested on the EPVT. We have complete 
data on a sample of 275 children, the results from which provide further evidence for the 
dependence of verbal conservation on level of receptive language development. The sample 
consists of 138 boys and 137 girls with chronological age (CA) ranging from 5:2 to 7:9, mean 6:8 (SD 6.14 
months). The vocabulary age (VA) range was much wider, 3:6-9:8, but the mean was very 
similar, 6:6 (SD 11.35 months). Of the 275 children, 58 (21 per cent) were full verbal 
conservers. Figures for the number of subjects passing the individual tasks were very 
similar: task 1, 69 (25 per cent); task 2, 73 (26.5 per cent); task 3, 77 (28 per cent). CA 
correlated with verbal conservation (pass/fail) much less than did VA with verbal 
conservation (0.28 vs. 0.44). Thus, VA accounted for over twice as much of the variance 
(19 per cent) compared with CA (8 per cent). (Very similar results obtain regardless of 
whether a pass/fail criterion or number of tasks passed was employed, since the two 
measures correlate highly, 0.92.) Conservation pass rates for both CA and VA in 
succeeding years of age are given in Table 6. This shows the greater discriminative and 
predictive power of VA, especially in the period from 6 to 8 years. Verbal conservation 
ability is thus shown to be particularly dependent upon receptive language ability in line 
with our theory. 
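
The variance figures quoted above are the squares of the two correlation coefficients, and the pass rates are simple proportions of the sample of 275. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the only inputs are values reported in the text.

```python
# Correlations of verbal conservation (pass/fail) with CA and with VA.
r_ca = 0.28
r_va = 0.44

# Proportion of variance accounted for is the squared correlation.
variance_ca = r_ca ** 2
variance_va = r_va ** 2
ratio = variance_va / variance_ca

print(f"CA: {100 * variance_ca:.0f} per cent of variance")
print(f"VA: {100 * variance_va:.0f} per cent of variance")
print(f"VA accounts for {ratio:.1f} times as much variance as CA")

# Pass rates out of the 275 children tested.
sample_size = 275
passes = {"all three tasks": 58, "task 1": 69, "task 2": 73, "task 3": 77}
pass_rates = {label: 100 * n / sample_size for label, n in passes.items()}
for label, rate in pass_rates.items():
    print(f"{label}: {rate:.1f} per cent")
```

Squaring is what turns the modest-looking gap between 0.28 and 0.44 into a more than twofold difference in variance accounted for.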


Table 6. Number and percentage of children conserving (verbally) in one-year age bands for 
CA and VA 


Age band (years)      <5       5       6       7      8+

CA
  n                           30     166      79
  Pass                         1      30      27
  %                         3.33   18.07   34.18

VA
  n                   15      60     103      84      13
  Pass                 0       2      10      39       7
  %                    0    3.33    9.71   46.43   53.85


Final discussion 


In this series of experiments we appear to have demonstrated successfully that certain 
cognitive skills are acquired earlier than Piagetian theory would have us believe and that 
traditional conservation assessment procedures may be verbally biased. We have 
demonstrated that a non-verbal assessment procedure consistently produces higher pass 
rates for children tested for their conservation of liquid quantity compared with their 
performance as assessed by traditional verbal procedures. Moreover, we have demonstrated 
that all children who conserved verbally could also conserve non-verbally. This evidence, 
alongside our correlational data showing the strong relationship between verbal 
conservation performance and receptive language development, suggests that traditional 
assessment procedures are verbally biased, requiring, in effect, the child to answer two 
questions, as Herriot (1969) has suggested: ‘the first is to understand what the test question 
is, the second is to answer it. An incorrect response may be a failure to understand the 
question ...’. A non-verbal procedure, such as the one we have described, makes sure that 
the child knows what he has to do and hence the results are a more reliable indication of 
his real level of cognitive development. 

Before considering the theoretical implications of our work it is worth reviewing briefly 
what we may eliminate as explanations of our results. They are not due to an order effect 
since verbal conservation has been shown to be consistently inferior to non-verbal 
conservation, whether it precedes or follows response-training and non-verbal assessment. 
There is no evidence to suggest that the standard (original) response-training procedure 
trains conservation since no transformations are made during training and since subsequent 
verbal performance is not improved. Nor are our results due to a set developing to press 
the button (for equal) regardless of stimulus array in the test trials, since in later studies 
when unequal quantity trials were presented in between equality test trials, children 
appropriately refrained from pressing the button. In a subsequent experiment a further 
precaution is taken whereby half of the subjects are trained to press the button, and half 
are taught to refrain from pressing the button, to signal equal quantity (Poborca, 
forthcoming). 

From this refutation of alternative explanations of our results, we next turn to a 
consideration of what our non-verbal approach may fairly be claimed to achieve in 
combatting the various ambiguities and confusions inherent in the traditional verbal 
paradigm for the assessment of conservation (as detailed in the introduction). We would 
argue that it avoids or reduces the influence of the following: 

(1) failure to comprehend fully words such as ‘more’, ‘same’, and ‘amount’; 

(2) the contextual expectation of a changed response being necessary when a question is 
repeated following a change in the stimulus array; 

(3) primacy or recency effects whereby the child opts consistently for the first or last of 
several options made available to him (e.g. he says the two quantities are ‘the same’ since 
this was the last choice offered to him); 

(4) overindulgence in affirmative ‘yes, yes’ replies to the tester’s questioning; 

(5) the requirement of verbal justification of responses and its associated problems. 

On the other hand, however, it must be admitted that the non-verbal paradigm, in failing 
to specify manifestly that amount is the criterion, introduces three new, potential, problems: 
(a) the confusion of same liquid with same amount of liquid; (b) the confusion of liquid 
level with amount of liquid; and (c) the confusion of same containers with same amount of 
liquid. These problems may be shown to be of less importance than they might at first 
appear. 

Readers of our procedure might have formed the impression that our non-verbal tests 
could be passed by a child who only thought that the liquid rather than the amount of 
liquid was conserved, resulting in many ‘false positives’. During our response training 
procedures, however, it is demonstrated that sameness of liquid is, in a sense, irrelevant 
since the child has to press the button when presented with equal quantities of liquid, and 
would again have to press when subsequently presented with two equal quantities of liquid, 
one of which had overtly been changed for an equal quantity of different liquid. Thus, 
sameness or not of liquid is not learned as a relevant cue as to when to press or when not 
to press the button. 

The second potential confusion is more problematical, however, since it appears to be 
impossible to differentiate between level of liquid and amount of liquid within the response 
training, without teaching the task. This was demonstrated in our attempt to improve our 
training by the addition of trials designed to teach the child to avoid comparisons based on 
the height of liquid level. We concluded that teaching a child to ignore liquid level might be 
seen as providing training in a subskill essential to successful conservation. (On this basis, 
Furth's (1966) work on conservation in deaf children might be suspect insofar as our 
expanded training was, in part, based on his procedures which employed blocks to vary 
liquid level.) 

Thus we cannot conclude, on the basis of our advocated (original) training procedure, 
that our method successfully differentiates between level and amount of liquid. On the 
pragmatic level, however, we can point to the fact that our evidence does demonstrate that 
more children do base their judgements on quantity in the non-verbal paradigm than in the 
verbal, i.e. although we do not specify quantity (as against level) the children appear to 
respond on the basis of quantity whereas in the verbal procedure quantity is specified and 
yet most children of this age appear to base their judgement on level. We might go further 
than this, even, and claim that since our procedure allows for judgements based on level, 
then our non-verbal conservation pass rates are conservative (since it is possible that some 
of the children responding on the basis of level could have responded on the basis of 
quantity). In other words, confusion of level and amount works against our hypothesis by 
possibly allowing too many false negatives. 

The third potential confusion, of same containers with same amount of liquid, is of 
minimal concern. Piaget & Inhelder (1969) mention that deaf children ‘must be made to 
understand that the questions have to do with the contents of the containers and not with 
the containers themselves’. In our studies, however, this did not appear to present any 
difficulty. If the children had based their judgement upon sameness of containers they 
would have refrained from pressing the button (for equality) in the conservation trials and 
would have (perhaps falsely) been recorded as non-conservers. Such an occurrence would 
be unlikely, however, as the earlier training would have demonstrated that sameness of 
container was an irrelevant cue. 

Our demonstration that children who cannot conserve on verbal Piagetian conservation 
tasks may conserve on the same tasks when presented non-verbally may be seen, in one 
sense, as being in complete accord with Piaget's beliefs that the young child begins to 
coordinate his cognitive actions 'even prior to language acquisition so that one sees a kind 
of logic-of-action' (Piaget, quoted in Flavell, 1963). 'There is a logic of the coordination of
action. This logic is more profound than the logic attached to language' (Piaget, 1969). In 
his view, language translates what is already understood so that children can easily 
understand the question when it does not go beyond the concepts they have already formed 
through action; children who do not understand the relationships of ‘more’, ‘same’ and 
‘amount’ therefore cannot understand the words embodying these relationships. Their 
failure to understand the relationships on which the concept of conservation depends will 
be reflected in a failure to understand the instruction. But the reverse would not be true. 
Children who understand these relationships in the real world would not necessarily 
understand the words that encode them or the syntax of the questions which elicit them.

Piaget feels that possible failure to realize that the question is about total quantity is 
similar to the problem of deceptive perception; that the child who is truly operational will 
‘by the use of intelligence’ focus on the relevant characteristics for quantity conservation 
and thus demonstrate conservation; and that the fact that children may isolate an 
irrelevant characteristic may be ‘due as much to lack of understanding of the notions in 
question as failure to grasp the verbal question’ (Piaget, 1952). ‘Piaget holds that lack of 
comprehension of these terms (more, longer, same) indicates that the child has not
assimilated this knowledge to the appropriate cognitive structure. Therefore, such lack of 
comprehension in itself is an indicator of cognitive level' (Sigel & Hooper, 1969). But 
comprehension of the questions Piaget poses, in assessing conservation, requires mastery of 
the necessary receptive verbal skills which are additional to and distinct from, even though 
they may be partly dependent upon, the cognitive skill which is being investigated.
Children comprehending the dependent physical relationships may attain the concept of 
conservation without yet understanding their expression in words. Later, in the context of 
Stage II, Piaget suggests an alternative, more plausible, argument when he discusses his 
findings that a child at a certain point in development conserves liquid quantity in some 
situations but, when the discrepancy in liquid level becomes too great, ‘falls back on his 
earlier belief in non-conservation' (1952). This leads Piaget to conclude, 'If the child 
hesitates, if he gives the correct answer when the variations are slight but does not assume
conservation when the variation in shape is greater, it is obvious that he understands the 
question but is not convinced a priori of the constancy of the whole quantity' (1952). 

How can we explain this in the context of the present findings? Firstly, we found very 
little evidence for this intermediary stage in either verbal or non-verbal conservation 
though one might expect this in the learning of any skill. Secondly, it could be argued that 
our transformations did not produce sufficiently large discrepancies in liquid level and were 
thus not a sufficiently rigorous test. We can reply that, in our opinion, the three tasks 
provided three clearly distinct examples of gross discrepancies in level. It is possible that we 
found so few ‘transitional’ or ‘intermediary’ children since our tasks were so rigorous and 
did, in fact, only ‘pick up’ real, full conservers.

The most likely reconciliation between Piaget's findings and our own lies in a 
consideration of the interaction between verbal questioning and context/expectancy. A 
child, in developing receptive language, does not merely learn how to ‘decode’ sentences 
but also learns the more socially sophisticated skills which require an allowance to be made 
for non-verbal cues such as facial expression, tone of voice, etc., and, in particular, context 
and expectancy. Given that a child can conserve non-verbally, he may also be able to 
demonstrate conservations verbally under certain conditions, i.e. if the transformation 
produced does not result in gross discrepancy in level. But a similar question posed when 
the discrepancy is great may lead him to suspect, as a result of contextual cues and 
expectancy, that the assessor must mean ‘level’. Thus a child may understand that ‘more’ 
refers to amount in some contexts but not in others. Sinha & Walkerdine (1978) refer to 
situations where ‘the social “logic” of the context — that is, the rules and conventions 
governing everyday discourse — takes precedence over the formal logic of the problem’. A 
non-verbal procedure may reduce the importance of the social logic in conservation 
assessment situations, thereby making clear to the child what we want him to base his 
judgements on. 

Thus our arguments in favour of non-verbal assessment procedures, of the type we have
described, are based upon two major considerations. Firstly, children may not understand 
the grammatical or semantic forms employed in the verbal interrogation procedures. 
Secondly, even if they can understand these forms in certain contexts, the context of the 
traditional Piagetian verbal assessment procedure may be such that the 'social logic' 
‘swamps’ the ‘formal logic of the problem’. In this latter case verbal assessment of 
conservation may be more a measure of ‘contextual assumption breaking’ than of logical 
skills. 

Finally, it must be emphasized that although we have applied this paradigm only to the 
concept of conservation of liquid quantity, it is equally amenable to other conservations 
(such as length, number, substance) as well as other Piagetian concepts, thus providing 
assessments of a child’s cognitive development which would be relatively unaffected by
factors such as verbal understanding or fluency and linguistic experience. It might, 
therefore, prove especially useful in assessing the cognitive development of handicapped 
children, children from disadvantaged or from minority culture backgrounds or any other 
groups with retarded (receptive) language development. 
Factor stability of the Edinburgh Handedness Inventory as a function of
test-retest performance, age and sex 


K. McFarland and J. Anderson 





The Edinburgh Handedness Inventory (EHI) was administered to 600 students from two urban 
secondary schools and two universities. The respondents were categorized by age (three groups) and 
sex. The questionnaire was re-administered to a sample of 69 respondents after an interval of 4 weeks. 
Principal axes factor analysis of the test and retest data showed the EHI to have a single factor which 
exhibited a high stability over the test-retest time period. However, one item (the use of scissors) was 
relatively unstable in relation to the derived (handedness) factor. A simple factor structure was also 
extracted from each ‘age x sex’ data set. The handedness factor showed a high stability across both 
age and sex. However, three items (knife, broom and box-lid) did not load well on the handedness 
factor and were generally unstable in relation to this factor. When assessing handedness with this 
questionnaire it is recommended that responses to particular items be weighted according to their 
stability and their contribution to the handedness factor. Alternative scoring procedures would 
seriously affect the otherwise high validity that this questionnaire has for assessing handedness. 





Any investigation of cerebral and behavioural asymmetries usually requires some 
assessment of handedness. For reasons of ease and quickness of administration, the 
‘handedness questionnaire’ is often used for this purpose (see Annett, 1970; Oldfield, 1971; 
Touwen, 1972). Given the wide use of such questionnaires, it is somewhat surprising that 
only a limited number of studies have examined their factor (construct) validity. One 
questionnaire which has received some attention in this regard is the Edinburgh 
Handedness Inventory (EHI) (Oldfield, 1971). White & Ashton (1976), using a slightly
modified form of the EHI, found the questionnaire to have a simple factor structure with 
one major factor which could be called handedness and a second minor factor unrelated to 
handedness. A more recent study by Bryden (1977) yielded similar results. 

No investigator has, to date, supplied evidence that this factor structure is maintained 
between test and retest performance, across age and between the sexes. Bryden (1977) 
mentioned that ‘preliminary calculations revealed approximately the same factor structure 
for men and women’, but specific details were not reported. McMeekan & Lishman (1975) 
reported on the test-retest reliability of the EHI in terms of changes in the response score 
distributions. However, it is not known whether the variability in response distributions 
was due to an instability of the factor structure of the questionnaire, or due to variability 
arising from sources other than handedness (see also Raczkowski et al., 1974; Coren & 
Porac, 1978). The present study extends these earlier reports and is specifically concerned 
with examining the factor stability, and hence the construct validity, of the EHI as a 
function of test-retest performance, age and sex. 


Method 


The EHI consists of 10 items which may be summarized as follows: (1) writing, (2) drawing, (3) 
throwing, (4) scissors, (5) toothbrush, (6) knife, without fork, (7) spoon, (8) broom (upper hand), (9) 
striking a match (match), and (10) opening box (lid). Respondents are required to indicate their hand 
preference and strength of preference by placing a + or a + + beside each item in columns marked 
left and right or to indicate indifference by placing a + in each column. Scoring is achieved in the 
present study by allocating numbers 1-5 (strong left to strong right) for each of the possible types of 
response. While this scoring procedure differs from that proposed by Oldfield (discussed below), it 
enables an analysis of the factor structure to be undertaken (see also White & Ashton, 1976; Bryden, 
1977). 
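The 1-5 coding just described can be sketched in a few lines. This is only an illustration, not the authors' own procedure: the function name `score_item` and the representation of a response as a pair of mark counts are assumptions, and placing the indifferent response at the midpoint (3) is inferred from the 'strong left to strong right' ordering.

```python
def score_item(left_marks: int, right_marks: int) -> int:
    """Convert the '+' marks beside one item into a 1-5 score.

    Assumed encoding (hypothetical): the arguments are the number of '+'
    marks placed in the left and right columns for that item.
    """
    codes = {
        (2, 0): 1,  # '++' in the left column: strong left
        (1, 0): 2,  # '+' in the left column: weak left
        (1, 1): 3,  # '+' in each column: indifferent
        (0, 1): 4,  # '+' in the right column: weak right
        (0, 2): 5,  # '++' in the right column: strong right
    }
    try:
        return codes[(left_marks, right_marks)]
    except KeyError:
        raise ValueError(f"unrecognised response: {(left_marks, right_marks)}")
```

For example, a respondent who marks '++' in the right column for writing would receive `score_item(0, 2)`, i.e. 5, for that item.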

The questionnaire was administered to 600 students from two urban high schools and two 
universities. Generally the regular class teacher/lecturer administered the questionnaire to the
students in groups varying in size from 30-70 in number. The test administrator was instructed to
ensure that the student read the questionnaire carefully and also to answer any queries raised by the
respondents.

The respondents ranged in age from 12-50 years and were categorized, on an a priori basis, into 
three age groups, under 15, 15-17 and over 18 years of age. Sixty-nine of the respondents from the 
over 18 years age group were re-administered the questionnaire, by the same person, after an interval 
of 4 weeks. 


Results 
Test-retest performance


Table 1 presents the response frequencies for particular items on the EHI as a function of 
test-retest performance. On this table the ‘off-diagonal’ entries give the frequencies with 
which responses changed from test to retest. Items 5 (toothbrush), 8 (broom), and 9 
(match), accounted for most of the changes in initial ‘right’ responses, and items 3 
(throwing) and 10 (box-lid) for changes in initial ‘indifferent’ responses. Most of these 
indifferent responses shifted in the direction of right responses. Item 8 (broom) accounted 
for most of the changes in initial ‘left’ responses. 

Only 65 per cent of all responses remained unchanged across test sessions (diagonal, 
Table 1). However, if only the broad categories of ‘left’, ‘indifferent’ and ‘right’ are 
considered (i.e. collapsing + and + + responses), this instability is seen to be more of 
degree than kind. With collapsing, 86 per cent of responses remained in these broad
categories of left, indifferent and right. 
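The difference between exact and collapsed agreement can be made concrete with a small sketch. The response pairs below are invented purely for illustration and are not the study's data; the function names `collapse` and `agreement` are likewise hypothetical, and the 1-5 coding is the one used in the present study.

```python
def collapse(score: int) -> str:
    """Collapse a 1-5 score into the broad left / indifferent / right categories."""
    if score in (1, 2):
        return "left"
    if score == 3:
        return "indifferent"
    if score in (4, 5):
        return "right"
    raise ValueError(f"score out of range: {score}")


def agreement(pairs, key=lambda score: score):
    """Proportion of (test, retest) pairs that agree after applying `key`."""
    same = sum(1 for test, retest in pairs if key(test) == key(retest))
    return same / len(pairs)


# Hypothetical (test, retest) scores for eight responses.
pairs = [(5, 5), (5, 4), (4, 5), (3, 3), (3, 4), (1, 1), (2, 1), (4, 4)]

exact = agreement(pairs)            # 0.5: only identical scores count
broad = agreement(pairs, collapse)  # 0.875: shifts of degree no longer count
```

In this toy example most disagreements are shifts between 'weak' and 'strong' on the same side, so agreement rises sharply once the categories are collapsed, which is the pattern described above.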

To determine the source of the instability exhibited in the response distributions a 
principal axes factor analysis was performed on each of the test and retest data sets. In 
both cases only one factor was extracted (only factors with eigenroots greater than unity 
were considered; Gorsuch, 1974). The factor loadings and the variance accounted for by 
each factor are presented in Table 2. 

The principal concern here is the consistency of the factor loadings across the two 
analyses, rather than which items load high or low per se. Item 4 (scissors) stands out in 
that it loaded well on the ‘test’ factor matrix, but not on the ‘retest’ matrix (lowest 
loading). Items 2 (drawing) and 10 (box-lid) also show some discrepancy in their respective 
loadings. 

These discrepancies in factor loadings raise the possibility that, while a single factor was 
extracted from both sets of data, the factors identified may not be the same. To examine 
this possibility, the factor scores for each respondent were derived for the test and retest 
conditions. The correlation between these derived scores was r = 0.91 which strongly
indicates that the (handedness) factor and individual's ‘handedness’ remained unchanged 
over the test-retest conditions (Gorsuch, 1974). This, of course, does not preclude the 
possibility that item 4 is unstable with respect to handedness, but it does indicate that when 
the weighted responses to all items are considered the instability of item 4 is minimized. As 
a general conclusion, the variability evidenced in response frequencies (Table 1) must arise 
from sources other than handedness and when factor scores are used the EHI exhibits a 
high test-retest validity. 


Effects of age and sex 


Table 3 presents the distributions of responses (as percentages) on the EHI items as a 
function of age and sex. Significant age and sex differences in these response distributions, 
as determined by χ² tests, are indicated on that table. Items 3 (throwing) and 4 (scissors)
showed significant sex effects; and age had an effect on all distributions with the exceptions 
of items 1 (writing) and 2 (drawing). As with the foregoing test-retest analysis, the question 
Table 1. Distribution of responses on handedness items as a function of test-retest
performance

                                     Retest performance
Test           Item            Strong       Weak         Left or      Weak         Strong
performance    number          left         left         right        right        right

Strong left    1               5            0            0            0            0
               2               5            0            0            0            0
               3               3            0            0            0            0
               4               2            0            0            0            0
               5               4            0            0            0            0
               6               2            0            1            0            0
               7               2            1            0            0            0
               8               1            2            0            0            0
               9               2            0            0            0            0
               10              1            1            0            0            0
               Total (% row)   27 (84.4%)   4 (12.5%)    1 (3.1%)     0            0

Weak left      1               1            0            0            0            0
               2               1            0            0            0            0
               3               1            0            0            0            0
               4               0            0            1            0            0
               5               0            1            0            0            0
               6               0            3            0            0            0
               7               0            1            0            0            0
               8               0            6            1            2            1
               9               1            2            1            1            0
               10              1            3            2            1            0
               Total (% row)   5 (16.1%)    16 (51.6%)   5 (16.1%)    4 (12.9%)    1 (3.2%)

Left or right  1               0            0            0            0            0
               2               0            0            0            0            0
               3               0            0            5            5            1
               4               0            1            6            0            0
               5               0            0            6            4            0
               6               0            0            5            0            1
               7               0            0            6            2            1
               8               0            0            13           3            1
               9               0            0            8            0            0
               10              0            1            20           6            2
               Total (% row)   0            2 (2.1%)     69 (71.1%)   20 (20.6%)   6 (6.2%)

Weak right     1               0            0            0            4            3
               2               0            0            2            9            7
               3               0            0            4            15           13
               4               0            0            2            14           10
               5               0            0            11           12           8
               6               0            0            3            16           11
               7               0            0            6            26           9
               8               1            1            3            16           7
               9               0            0            6            19           8
               10              0            0            5            16           5
               Total (% row)   1 (0.4%)     1 (0.4%)     42 (15.4%)   147 (54.0%)  81 (29.8%)

Strong right   1               0            0            0            8            48
               2               0            0            1            7            37
               3               0            0            0            5            17
               4               0            0            2            8            23
               5               0            0            2            8            12
               6               0            0            2            6            19
               7               0            0            0            4            10
               8               1            1            3            16           7
               9               0            0            6            19           8
               10              0            0            1            2            2

Table 2. Factor loadings from principal axes factor analyses of test and retest data sets

                    Test factor    Retest factor
Item number         matrix         matrix

 1  Writing         0.84           0.79
 2  Drawing         0.89           0.80
 3  Throwing        0.66           0.68
 4  Scissors        0.73           0.62
 5  Toothbrush      0.76           0.75
 6  Knife           0.85           0.81
 7  Spoon           0.85           0.81
 8  Broom           0.64           0.69
 9  Match           0.86           0.87
10  Box-lid         0.62           0.69
Variance (%)        63.9           61.4





Table 3. Distribution of responses on handedness items as a function of sex and age (%)

Item          Male response category                    Female response category
number     1      2      3      4      5             1      2      3      4      5

Age: under 15 years (males, n = 91; females, n = 95)
 1        11.0    1.1    1.1   19.8   67.0           7.4    3.2    1.1   18.9   69.5
 2        11.0    1.1    4.4   20.9   62.6           7.4    2.1    4.2   16.8   69.5
 3 *       7.1    0.0   29.7   29.7   51.6           5.3    0.0   29.5   29.5   36.8
 4         6.6    1.1   22.0   25.3   45.1           7.4    2.1   11.6   21.1   57.9
 5         7.1    3.3   33.0   20.9   35.2           5.3    3.2   41.1   24.2   26.3
 6         6.6    4.4   17.6   28.6   42.9           5.3    6.3   15.8   20.0   52.6
 7         8.8    4.4   27.5   20.9   38.5           4.2    2.1   27.4   23.2   43.2
 8        12.1    1.1   27.5   27.5   31.9           4.2    4.2   33.7   24.2   33.7
 9         5.5    4.4   24.2   24.2   41.8           4.2    5.3   31.6   23.2   35.8
10         6.6    5.5   50.5   11.0   26.4           6.3    6.3   48.4   25.3   13.7

Age: 15-17 years (males, n = 117; females, n = 116)
 1         6.8    5.1    0.9   18.8   68.4           5.2    3.4    3.4   31.9   56.0
 2         6.0    6.0    2.6   21.4   64.1           2.6    4.3    8.6   27.6   56.9
 3 *       2.6    9.4   10.3   36.8   41.0           0.9    3.4   25.9   50.0   19.8
 4         0.9    3.4   25.6   47.0   23.1           1.7    2.6   15.5   41.4   38.8
 5         2.6    5.1   33.3   40.2   18.8           0.9    6.6   30.2   39.7   23.3
 6         3.4    5.1   12.8   47.0   31.6           0.0    9.5   14.7   48.3   27.6
 7         1.7    4.3   24.8   46.2   23.1           0.9   11.2   19.9   48.3   19.8
 8         1.7    9.4   41.0   29.1   18.8           0.9   11.2   33.6   38.8   15.5
 9         0.9    4.3   35.0   41.9   17.9           0.9    6.9   37.9   39.7   14.7
10         0.0    6.0   53.0   29.1   12.0           0.9    6.9   54.3   32.8    5.2

Age: over 18 years (males, n = 85; females, n = 96)
 1         4.7    2.4    0.0   17.6   75.3           4.2    1.0    0.0   19.8   75.0
 2         4.7    1.2    2.4   31.8   60.0           4.2    1.0    1.0   29.2   64.6
 3         3.5    3.5    9.8   49.4   34.1††         2.1    1.0   20.8   51.0   25.0†
 4 *       1.2    2.4   10.6   52.9   32.9†††        2.1    0.0    6.3   38.5   53.1††
 5         3.6    3.6   11.9   51.2   29.8†††        2.1    1.0   24.0   51.0   21.9††
 6         3.5    3.5    8.2   57.6   37.7†          3.1    1.0    9.4   46.9   39.6†††
 7         2.4    1.2   11.9   61.9   22.6†††        2.1    1.0   20.8   58.3   17.7†††
 8         2.4   11.9   35.7   42.9    7.1†††        2.1    7.4   32.6   44.2   13.7††
 9         2.4    5.9   17.6   58.8   15.3†††        2.4    1.1   15.8   51.6   29.5††
10         2.4    7.1   52.9   32.9    4.7†††        0.0    6.3   42.7   45.8    5.2††

*, Significant sex effect (item × sex, within age); †, significant age effect (χ² tests).
Table 4. Stability of factors and items across age for males and females

                            Factor
                            relationships    Item relationships
Comparison                  I       II       1     2     3     4     5     6     7     8     9     10

Males
Under 15 vs.          I     0.99    0.07     0.76  0.77  0.97  0.98  0.97  0.99  0.98  0.93  0.93  0.94
15-17 years           II    —       —

Under 15 vs.          I     0.99    0.08     0.90  0.92  0.93  0.99  0.99  0.97  0.99  0.84  0.98  0.90
over 18 years         II    —       —

Females
Under 15 vs.          I     0.99   −0.04     0.99  0.99  0.99  0.99  0.99  0.81  0.99  0.94  0.99  0.99
15-17 years           II    0.04    0.99

15-17 vs.             I     0.99    —        0.83  0.89  0.99  0.99  0.99  0.99  0.99  0.97  0.97  0.82
over 18 years         II    0.11    —

Under 15 vs.          I     0.99    —        0.91  0.92  0.99  0.99  0.99  0.85  0.99  0.83  0.97  0.89
over 18 years         II    0.05    —

—, No second factor.
Relatively unstable items are italicized.

is whether the variability in response distributions is due to genuine group differences in 
handedness, or due to variability arising from other, non-handedness factors. 

Principal axes factor analysis was performed on each of the six ‘age x sex’ groups with 
the criterion that only factors with eigenroots greater than unity be retained (Gorsuch, 
1974). A simple factor structure consisting of one or two factors was obtained for each 
group. The method of maximizing the congruence of corresponding test item vectors was 
used to examine the relationships among the item-vectors and among the factor structures 
obtained (see Gorsuch, 1974 and Veldman, 1967 for discussion). Tables 4 and 5 present the 
results from the pair-wise comparisons of each of the factor structures obtained from the six 
analyses. As shown on these tables the cosines (correlations) between corresponding factor 
axes all exceeded 0.98 which attests to the high factor stability of the EHI as a function of
both age and sex. Generally, the item-vector cosines (correlations) are also relatively large 
indicating a high stability of responses to particular items across the groups. However, 
there are some exceptions. Chief among these are items 6 (knife, without fork), 8 (broom) 
and 10 (box-lid). On some comparisons, items 1 (writing) and 2 (drawing) also exhibit
some instability relative to the other items. 
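The cosine between two vectors of loadings is the basic quantity behind comparisons of this kind. The sketch below computes it for the test and retest loadings of Table 2, purely as an illustration. It is a simplification under stated assumptions: it does not carry out the rotation to maximum congruence discussed by Veldman (1967), it compares whole factors rather than individual item-vectors, and the function name `congruence` is hypothetical.

```python
import math


def congruence(a, b):
    """Cosine of the angle between two loading vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("loading vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Loadings for items 1-10 as listed in Table 2 (test and retest analyses).
test_loadings = [0.84, 0.89, 0.66, 0.73, 0.76, 0.85, 0.85, 0.64, 0.86, 0.62]
retest_loadings = [0.79, 0.80, 0.68, 0.62, 0.75, 0.81, 0.81, 0.69, 0.87, 0.69]

cosine = congruence(test_loadings, retest_loadings)
```

A cosine close to 1 indicates that the two factors point in nearly the same direction in item space, even when individual loadings (such as that of item 4) differ between analyses.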
Table 5. Stability of factors and items across sex for each age level

                            Factor
                            relationships    Item relationships
Comparison                  I       II       1     2     3     4     5     6     7     8     9     10

Under 15 years
Males vs.             I     0.99    —        0.91  0.92  0.99  0.99  0.99  0.85  0.99  0.83  0.97  0.89
females               II    0.05    —

15-17 years
Males vs.             I     0.99   −0.04     0.99  0.98  0.92  0.96  0.95  0.97  0.94  0.99  0.99  0.97
females               II    0.04    0.99

Over 18 years
Males vs.             I     0.99    0.08     0.90  0.92  0.93  0.99  0.99  0.97  0.99  0.84  0.98  0.90
females               II    —       —

—, No second factor.
Relatively unstable items are italicized.


Because the factors identified are essentially identical, all the factor loading matrices are 
not presented here. However, the factor loadings obtained from an overall principal axes 
factor analysis are presented in Table 6 and these loadings are representative of those 
obtained from each of the six ‘age x sex’ analyses. In those cases where two factors were 
obtained (Tables 4 and 5), one factor could easily be identified as the handedness factor 
and the other, when it occurred, was peculiar to items 6, 8 and 10 (those that load least on 
the handedness factor, Table 6). 


Table 6. Factor loading matrix from principal axes factor analysis of combined data
(n = 600)

Item number         Factor loading

 1  Writing         0.82
 2  Drawing         0.83
 3  Throwing        0.73
 4  Scissors        0.69
 5  Toothbrush      0.72
 6  Knife           0.63
 7  Spoon           0.77
 8  Broom           0.59
 9  Match           0.75
10  Box-lid         0.55
Variance (%)        55.6








The instability of items 6, 8 and 10 is not of great importance because they load least on 
the handedness factor (Table 6). However, the occasional instability detected for items 1 
and 2 is of some concern as these items contribute to the ‘definition’ of handedness as 
measured by this questionnaire. Nevertheless, the very fact that these items consistently 
load highly on the handedness factor is sufficient for the instability of these loadings not to
greatly affect the meaning of the extracted handedness factor across age and sex. This, 
taken in conjunction with the large correlations among the handedness factor axes, assures 
that the same factor was identified for each group and that the EHI was measuring the 
same thing (handedness) in all cases. 

As a final check that handedness was constant across age and sex, a two-way (unequal n) 
ANOVA was conducted on the factor scores derived from the overall principal axes 
analysis (Table 6). In this analysis no significant age, sex or age x sex effects were found. 

It is concluded that the differences observed in the response frequency distributions 
(Table 3) are not due to differences in handedness, or to an instability of the EHI in 
measuring handedness, but due to variability arising from other, non-handedness sources. 


Analysis based on laterality quotients 


Oldfield (1971) scored the EHI by adding the number of responses in each (left and right) 
column, subtracting these sums, dividing by their total and multiplying by 100. The score 
obtained, the laterality quotient (LQ), could range from +100 to −100. In the present
study a significant correlation was found between the test-retest LQ scores (r = 0.90). This
indicates that the EHI has high reliability (cf. McMeekan & Lishman, 1975). However, 
because LQ scores are derived by giving equal weighting to all items, and because unequal 
factor loadings were obtained in the foregoing factor analyses, this index must be 
considered susceptible to variation from sources other than handedness. 
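The laterality quotient described above can be written out as a short sketch. The function name `laterality_quotient` is hypothetical, and the sign convention (right minus left, so that right-handers score positively) is an assumption, since the text states only that the two column sums are subtracted.

```python
def laterality_quotient(right_total: int, left_total: int) -> float:
    """LQ = 100 * (R - L) / (R + L), where R and L are the numbers of
    '+' marks summed over all items in the right and left columns.

    The right-minus-left direction is assumed, not stated in the text.
    """
    total = right_total + left_total
    if total == 0:
        raise ValueError("no responses recorded")
    return 100.0 * (right_total - left_total) / total
```

Under this convention a respondent with all marks in the right column obtains +100, one with all marks in the left column obtains −100, and one with eight marks on the right and two on the left obtains +60. Every mark counts equally whichever item it belongs to, which is why the index gives the same weight to poorly loading items as to writing or drawing.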

A two-way (age x sex), unequal n, ANOVA performed on the LQ scores from the present 
sample of subjects revealed no significant effects due to sex or age x sex. However, there 
was a significant age effect (F = 4.95, d.f. = 2, 596, P < 0.01). This is contrary to the results
obtained from an equivalent analysis on the factor scores (above). Apparently LQ scores 
are seriously affected by ‘non-handedness’ factors, particularly across age. 


Discussion 


The present results confirm that the EHI has a high factor stability and hence validity, as a 
function of test-retest performance, age and sex. However, a few items load poorly on the 
extracted ‘handedness’ factor. These were, items 6 (the use of a knife, without fork), 8 (the 
use of a broom) and 10 (the lifting of a lid on a box). These items were also the most 
unstable across age and sex. The use of scissors (item 4) was the most unstable item in 
relation to the handedness factor for the test-retest comparisons. 

The instability of these items with respect to handedness probably arises from a number 
of sources. White & Ashton (1976) and Bryden (1977) suggest the items which load poorly
on the handedness factor refer to manually ambiguous activities which occur with relatively 
low frequency and which require careful thought on the part of the respondent. Items 8 
(broom) and 10 (box-lid) would fall into this category, but item 6 (knife, without fork) is 
not so easily interpreted. However, there may be some ambiguity arising out of the 
‘without fork’ stipulation made in the item. The test-retest instability of item 4 (scissors) is
perplexing in that this activity would be relatively familiar to the respondents and the item
appears not to be ambiguous. However, it is interesting to note that Annett (1970) found 
the scissors item on her questionnaire could discriminate amongst respondents only after 
they had been classified as left-handed by other items. The scissors item failed to 
discriminate amongst those respondents with tendencies to right-handedness. In 
conjunction with the present results, it is possible that those in the latter category have 
difficulty in determining their strength of preference on the scissors item in a manner which 
is consistent with the use of the response scale for the remaining items. This may account 
for the test-retest instability of this item on the handedness factor; this factor being 
identified from the responses to the remaining items. 




The validity of the Oldfield questionnaire for assessing handedness would depend upon 
allocating appropriate (small) weights to the aforementioned items when scoring the
questionnaire. When factor scores cannot be derived it is recommended that the above 
items (4, 6, 8 and 10) not be scored and that a total score based on the remaining items be 
adopted as the most appropriate and valid measure of handedness for this questionnaire. 
The use of such a 1—0 weighting system would give a reasonable approximation of a 
respondent’s factor score (Gorsuch, 1974). The use of such a score would be most 
applicable in situations where subjects are to be grouped according to handedness, or 
where handedness is to be used as a continuous covariate. 
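The recommended 1-0 weighting amounts to summing the scores of the retained items only. A minimal sketch follows, assuming the 1-5 item coding used in the present study; the names `EXCLUDED_ITEMS` and `unit_weighted_score` are hypothetical.

```python
# Items given zero weight: scissors (4), knife (6), broom (8) and box-lid (10).
EXCLUDED_ITEMS = frozenset({4, 6, 8, 10})


def unit_weighted_score(scores):
    """Sum the 1-5 scores of the retained items (weight 1), ignoring the
    excluded items (weight 0).

    `scores` maps item number (1-10) to that item's 1-5 score.
    """
    return sum(score for item, score in scores.items()
               if item not in EXCLUDED_ITEMS)
```

With six items retained, the total ranges from 6 (strong left on every retained item) to 30 (strong right on every retained item), whatever the responses to the four excluded items.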

The laterality quotient, which gives equal weighting to all items, would appear to be 
influenced by variation arising from sources other than handedness and consequently its 
validity may be questionable. In this regard the finding of a significant age effect with LQ 
scores in the present study has a number of implications. It is all too easy to envisage 
researchers finding age (and possibly sex) effects in laterality variables when subjects have 
been allocated to groups (or data analysed) according to LQ scores. This problem may be 
more widespread and future research may need to consider the issue of factor (construct) 
validity when other questionnaires and behavioural methods are used to assess handedness. 
There would appear to be a major problem in assessing the often subtle age and sex effects 
found in laterality research when inappropriate weighting has been given to ‘handedness’ 
items susceptible to extraneous sources of variance. 
Presentation and representation in design problem-solving
John M. Carroll, John C. Thomas and Ashok Malhotra 





Two experimental studies of design problem-solving are presented. Eighty-one subjects worked on 
one of two design problems that were isomorphic in structure: a schedule for stages in a 
manufacturing process or a layout for a business office. In Expt 1, a difference between problem 
isomorphs is obtained: the ‘spatial’ office layout problem obtains better performance and shorter 
solution times than the ‘temporal’ scheduling problem. In Expt 2, this difference attenuates when 
subjects are provided with a graphic representation in both isomorph conditions. The availability of a 
graphic representation is discussed as an aid for procedural design.





Variables of presentation and representation have received considerable attention in studies 
of human problem-solving. Schwartz (Schwartz, 1971; Polich & Schwartz, 1974) has shown 
that making a graphic means of representing solutions available improves subjects’ 
performance in deduction tasks. Reed et al. (1974) and Simon & Hayes (1976) have shown 
that subjects’ performance can differ for logically isomorphic versions of certain problems, 
when those versions are presented with different ‘cover stories’. 

The present study explores the variables of presentation and representation in a ‘design’ 
problem-solving task environment (Thomas & Carroll, 1979). Design problem-solving 
belongs to that relatively under-studied area of human problem-solving that Reitman 
(1965) associated with ‘ill-structured’ problems. Problem-solving behaviour in this domain 
characteristically cannot be specified minutely as a set of moves, selected from a small and 
finite initial state in order to derive a unique final or goal state. 

A designer, typically, does not know in advance what the goal state will be, although he 
usually has criteria to evaluate potential goal states. Indeed, the designer often does not 
even have a definition of the initial problem state, or of the allowable moves. Simon (1973)
contends that the formal and behavioural analysis of ill-defined problems, such as design 
problems, can be accommodated by the theoretical apparatus developed already to treat 
well-defined problems (e.g. Newell & Simon, 1972) — but relatively little argument and no 
empirical evidence is adduced to this claim. 

The possibility exists, however, that there are substantive differences between well- 
structured and ill-structured sorts of problem-solving. Much of ‘ordinary’ (i.e. real-world) 
problem-solving is concerned with ill-structured, and not well-structured, problem 
situations. Thus, an analysis of human problem-solving that treats only the well-structured 
sort certainly risks being an inadequate analysis. One important task for human 
problem-solving research is to extend the existent analyses of well-structured problems to 
the domain of ill-structured problems. 

The two experiments described here address presentation and representation in a problem- 
solving environment that shares many of the features of design. Presentation is manipulated 
in two ways. First, we contrast a ‘temporal’ presentation of the problem with a logically 
isomorphic ‘spatial’ presentation. Second, we compare three different sequencings in which 
problem information can be presented to the problem-solver. Representation is addressed
by comparing performance of subjects who are provided with a graphic means of 
representing their solutions (Expt 2) with that of subjects not provided with a means of 
representation (Expt 1). Subjects in Expt 1 expressed their design solutions in any way they 
wanted. Subjects in Expt 2 solved the same design problems, but expressed their solutions 
using a prescribed graphic representation scheme. 
Experiment 1 


In Expt 1 subjects were presented with one of two isomorphic variations of a design 
problem: a temporal isomorph and a spatial isomorph. The various design requirements 
comprising the problem were presented to subjects in one of three sequencings, varying 
from highly structured to unstructured. The experiment attempted to test for effects of 
sequencing and isomorph on the dependent variables of performance success and solution 
time. 


Method 


Problem isomorphs. The design problem was presented to each subject in one of two isomorphic 
versions. The cover story for the temporal isomorph involved designing a manufacturing process for 
‘widgets’, which manufacturing process consisted of seven stages. Each stage was to be assigned to a 
factory shift, at a particular priority level (if workers get behind during their shift, they do the higher 
priority work first). 

To obtain the spatial cover story, we replaced key content words of the temporal cover story with 
‘spatial’ words. The spatial cover story involved designing a business office layout. The office was to 
accommodate seven employees. Each employee was to be assigned to a corridor a certain number of 
offices down from a central hallway (workers who are higher in prestige prefer to have their offices 
nearer to this central hallway). 

Both problems were defined by a set of 19 ‘functional requirements’. These consisted of six of each 
of three types of relation, plus a ‘compactness goal’. In the spatial isomorph condition, the three 
relations that could occur between entities were: (1) ‘is compatible (incompatible) with’, (2) ‘has 
more (less) prestige than’, and (3) ‘uses the accounting records (meets people in the reception area) 
more than’. In the temporal isomorph condition, the corresponding relations were: (1) ‘uses different 
(the same) resources than (as)’, (2) ‘is a higher (lower) priority manufacturing stage than’, and (3) 
‘should follow (precede)’. The following are examples of relations in the temporal isomorph problem 
statement: 

F is a higher priority manufacturing stage than B. 

A should follow stage C. 

G uses different resources than F. 
Subjects were presented with a total of 18 such relations (3 types × 2 polarities, e.g. precede versus 
follow, × 3 of each). 

There was one further goal given to the subjects, the compactness goal. In the temporal isomorph, 
they were told to minimize the total number of shifts in which they organized the stages comprising 
the manufacturing process. In the spatial isomorph, they were told to minimize the total number of 
corridors in which they organized the seven offices. Thus, subjects were asked to ‘compact’ their 
design solutions, restricting the total number of shifts or corridors they posited. (The 19 functional 
requirements for both isomorphs are presented in the Appendix.) 

It is important to note that the 19 functional requirements of the design problem involved 
‘trade-offs’. For example, the temporal isomorph subjects were told that ‘C should precede stage F’, 
that ‘G uses different resources than F’, and that ‘D should follow stage G’. Now, if G uses different 
resources than F, it ought to be scheduled for the same shift as F (for optimal efficiency, see below). 
And, since C should precede stage F (and therefore stage G) and D should follow stage G, we can 
conclude that C precedes D. However, our subjects also got the functional requirement ‘C uses 
different resources than D’. This requirement can be optimally satisfied only if D and C are 
scheduled for the same shift, which conflicts with the other requirements. 
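
The conflict among these four requirements can be checked mechanically. The following is a minimal sketch, not part of the original materials. It assumes that ‘precede’ means a strictly earlier shift and that a ‘uses different resources’ relation counts as satisfied only when the two stages share a shift, as the paragraph above suggests. All names in the code are illustrative.

```python
from itertools import product

STAGES = ["C", "D", "F", "G"]

# Each requirement is a predicate over a dict mapping stage -> shift number.
# Assumed semantics: "precede" = strictly earlier shift;
# "uses different resources" = satisfied only when both share a shift.
REQUIREMENTS = {
    "C should precede stage F": lambda s: s["C"] < s["F"],
    "G uses different resources than F": lambda s: s["G"] == s["F"],
    "D should follow stage G": lambda s: s["D"] > s["G"],
    "C uses different resources than D": lambda s: s["C"] == s["D"],
}


def satisfied(assignment):
    """Return the names of the requirements met by a shift assignment."""
    return [name for name, holds in REQUIREMENTS.items() if holds(assignment)]


def best_assignments(n_shifts=4):
    """Exhaustively search all assignments of the four stages to shifts."""
    best, winners = -1, []
    for shifts in product(range(1, n_shifts + 1), repeat=len(STAGES)):
        assignment = dict(zip(STAGES, shifts))
        count = len(satisfied(assignment))
        if count > best:
            best, winners = count, [assignment]
        elif count == best:
            winners.append(assignment)
    return best, winners


best, winners = best_assignments()
print(best)  # 3: no assignment satisfies all four requirements
```

Under these assumptions at most three of the four requirements can hold at once, which is the trade-off the text describes.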

As a result of these trade-offs, there is no perfect solution to the design problems we used: there is 
no solution that necessitates only one shift (or corridor) and satisfies all 18 of the remaining 
functional requirements. In fact, even ignoring the compactness requirement, there is no possible 
solution which satisfies all of the remaining 18 requirements. There are, of course, better and poorer 
solutions (see below), but as far as we can tell, there is no unique optimal solution either. This 
situation is typical of one aspect of ill-defined problem environments. Subjects were encouraged to 
find the best possible solution, given the inherent trade-offs of the problems. 


Sequence of presentation. The initial cover story was about 1½ typed pages in length for both isomorph 
conditions, and comprised the first two pages of a booklet given to subjects to read at their own pace. 
Following this cover story, the 19 functional requirements were distributed over the next four pages 
of the booklet. On each page, the subject was presented with six of the 18 relations or with the 
compactness goal. There were three possible presentation sequences for the 19 functional 
requirements. 

In the hierarchical presentation (HP) condition, page 3 contained a statement of the overall goals 
of the design problem. Thus, in the temporal isomorph problem statement subjects were told that the 
organization of the manufacturing process should be ‘efficient’ and ‘effective’. Efficiency, they were 
told, entailed (1) that the total number of shifts required by the process should be minimized and (2) 
that processes that could be scheduled for the same shift (i.e. processes that use different resources) 
should be. Effectiveness, they were told, meant (3) that stages should be optimally sequenced (i.e. 
stages should follow or precede one another as specified) and (4) that priorities should be taken into 
consideration in the schedule. Each of the four following pages of functional requirements elaborated 
one of these goals. (An isomorphic set of goals was presented to subjects in the spatial HP 
condition.) 

The clustered presentation (CP) condition was like the HP condition except that the statement of 
overall goals was not included. Thus, subjects in the CP condition were given no explicit framework 
for the 19 functional requirements, although their four pages of functional requirements were 
thematically clustered vis-à-vis these overall goals. 

The non-structured presentation (NSP) condition, like the CP condition, lacked an overall 
statement of the goals of the design problem. In addition, however, the 18 relations were ‘jumbled’. 
Hence, instead of getting all of the six requirements pertaining to the use of resources on a single page 
of the booklet, subjects in the NSP condition were presented with three requirements pertaining to 
resources mixed with three pertaining to priority. On a subsequent page they were presented with the 
three remaining resources requirements, this time mixed with three temporal sequencing (precede/ 
follow) requirements. 


Design, subjects and procedure. The design of the experiment was a 3 x 2 factorial, whose factors were 
‘sequence of presentation’ (HP, CP and NSP) and ‘isomorph presentation’ (spatial and temporal). 

A total of 36 students from small local colleges participated in the experiment. The subjects were 
run in groups of 12 each and were paid for their participation. Each subject was assigned to one of 
the six conditions. Subjects were given a booklet, and asked to read the first two pages of general 
instructions. After studying the instructions, subjects were invited to ask questions concerning the 
experiment. The experimenter reviewed the contents of the instructions, and then allowed subjects to 
proceed to page 3 of the booklet. At this point, subjects started work on the design problem. 
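
As an illustration of the 3 × 2 design, the sketch below enumerates the six conditions and distributes 36 subjects evenly among them. The text does not say how subjects were allocated in Expt 1, so balanced random assignment, the seed, and all names here are assumptions made for the example.

```python
import random
from collections import Counter
from itertools import product

SEQUENCES = ["HP", "CP", "NSP"]
ISOMORPHS = ["spatial", "temporal"]


def assign_subjects(n_subjects, seed=0):
    """Assign subject numbers 1..n to the six sequence x isomorph cells.

    Assumption: cells are filled as evenly as possible and the order is
    randomized; the original allocation procedure is not described.
    """
    conditions = list(product(SEQUENCES, ISOMORPHS))
    slots = [conditions[i % len(conditions)] for i in range(n_subjects)]
    random.Random(seed).shuffle(slots)
    return {subject: cond for subject, cond in enumerate(slots, start=1)}


assignment = assign_subjects(36)
cell_sizes = Counter(assignment.values())
print(len(cell_sizes), set(cell_sizes.values()))  # 6 {6}
```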

Subjects worked at their own pace. On each page of the booklet they were presented with 
additional information (further functional requirements) and asked to ‘design’ a solution to the 
problem on the basis of what they then knew. Subjects were not instructed as to how their design 
solutions should be expressed; they could express their solutions in any way they chose. It was 
emphasized that these ‘intermediate solutions’ were very important to the purpose of the experiment 
and that they should be taken as seriously as the final solution which the subject would ultimately 
design when all of the information had been presented. While working on any given page of the 
booklet, subjects were permitted to turn back to any previous page of the booklet, to review their 
previous work, to review the instructions, or to check on functional requirements presented earlier. 
They were forbidden to change any of their previous work or to look ahead in the booklet. 

After all of the functional requirements were presented, the subject was asked to give a final 
solution (page 8 in HP, page 7 in CP and NSP). The final two pages of the booklet consisted of a 
questionnaire, asking subjects about their strategies, feelings about the experiment, and previous 
designing experience. (There were no gross differences between subject samples.) The entire 
experimental session took about two hours. 


Scoring. Subjects' solutions were scored for both performance and solution time. Performance was 
scored by giving one point for each of the 18 functional requirements satisfied. The compactness 
requirement was scored by giving 4 points if the subject's solution consisted of two shifts (corridors), 
2 points if it consisted of three shifts (corridors), and 1 point if it consisted of four shifts (corridors). 
No subject produced a solution consisting of more than four shifts (corridors). The compactness 
requirement was weighted in this way in order that it would count about equally with the other three 
pages of functional requirements which presented six functional requirements each. In general, a 
good score for the other groups of functional requirements was a 4, more rarely a 5 (out of 6), with 
scores ranging down to 1, and in a few cases 0. 
To measure solution times, the experimenter began timing when the subjects turned to the third 
page of the experimental booklet and marked the elapsed time as each subject completed the final 
design solution. 
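
The performance scoring rule can be summarized in a few lines. This is a sketch under stated assumptions: it takes the overall score to be the simple sum of the satisfied relations and the compactness points, and it takes the per-type counts of satisfied relations as given (how each relation is judged satisfied is not modelled here). Function and key names are hypothetical.

```python
# Compactness points as described in the text: 4 points for two shifts
# (corridors), 2 for three, 1 for four. Other counts are not specified.
COMPACTNESS_POINTS = {2: 4, 3: 2, 4: 1}


def compactness_points(n_groups):
    """Points for a solution using n_groups shifts (or corridors)."""
    if n_groups not in COMPACTNESS_POINTS:
        raise ValueError(f"no scoring rule stated for {n_groups} shifts/corridors")
    return COMPACTNESS_POINTS[n_groups]


def performance_score(satisfied_by_type, n_groups):
    """Sum one point per satisfied relation plus the compactness points.

    satisfied_by_type maps a relation type to the number of its six
    requirements that the solution satisfies (0 to 6).
    """
    for kind, count in satisfied_by_type.items():
        if not 0 <= count <= 6:
            raise ValueError(f"{kind}: count must be between 0 and 6")
    return sum(satisfied_by_type.values()) + compactness_points(n_groups)


example = {"resources": 4, "sequence": 5, "priority": 4}
print(performance_score(example, n_groups=3))  # 15
```

On this reading the maximum attainable score would be 18 + 4 = 22.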


Results and discussion 


One subject failed to complete the experiment and was discarded from the analysis. The 
summary data for the remaining 35 subjects are presented in Table 1. 


Table 1. Mean performance scores and solution times: Expt 1 

                               Spatial isomorph          Temporal isomorph 
                               Performance  Solution     Performance       Solution 
                               scores       times        scores            times 
Hierarchical presentation      13.00        42.00        77 [sic] (8.20)   43.50 (43.20) 
Clustered presentation         11.50        38.83        10.33 (10.33)     44.00 (44.00) 
Non-structured presentation    11.83        34.60        7.50 (9.25)       51.67 (55.00) 
Overall                        12.06        38.50        8.33 (9.33)       46.39 (47.33) 


Analyses of variance for the 2 × 3 factorial of isomorph by sequence were computed for the 
performance and solution time measures. We used the method of expected equal frequencies 
(Ferguson, 1971, pp. 238-239), since there was a missing observation in one cell.* This ANOVA is 
justified in so far as the χ² of expected cell frequencies does not depart from chance; that is, 
in the data to be reported all χ² values fail to reject the null hypothesis of equal cell 
frequencies (and thus legitimize the ANOVA). In each case we report the obtained χ² in 
parentheses. 
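
The χ² check on cell frequencies described above can be illustrated as a goodness-of-fit test against equal expected frequencies. This is a sketch of how we read the procedure, not a reproduction of the authors' computation. The cell sizes are an assumption: 36 subjects spread evenly over six cells, with one subject dropped.

```python
def chi_square_equal_frequencies(cell_counts):
    """Chi-square statistic for observed cell counts against equal expected counts."""
    total = sum(cell_counts)
    expected = total / len(cell_counts)
    return sum((observed - expected) ** 2 / expected for observed in cell_counts)


# Tabled critical value of chi-square for d.f. = 5 at the 0.05 level.
CRITICAL_05_DF5 = 11.070

# Assumed cell sizes for the performance analysis of Expt 1 (35 subjects).
expt1_cells = [6, 6, 6, 6, 6, 5]

statistic = chi_square_equal_frequencies(expt1_cells)
print(round(statistic, 2), statistic < CRITICAL_05_DF5)  # 0.14 True
```

Under that assumption the statistic is far below the critical value, so the hypothesis of equal cell frequencies is not rejected.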

For performance scores (χ² = 0.14, d.f. = 5), a significant main effect of problem 
isomorph obtains, F = 12.99, d.f. = 1, 29, P < 0.001. This reflects the fact that subjects in 
the spatial isomorph condition had higher performance scores. The effect of sequence (HP 
vs. CP vs. NSP) and the interaction effect are both non-significant. For the solution time 
data (χ² = 0.26, d.f. = 5), the factor of problem isomorph is again significant, obtaining 
F = 7.01, d.f. = 1, 28, P < 0.025. (One subject failed to signal the experimenter when he 
finished the problem, and as a result had no solution time recorded.) Subjects in the 
spatial isomorph condition obtained shorter solution times than temporal isomorph 
subjects. The main effect of sequence is again non-significant, but the interaction effect is 
nearly significant, F = 2.47, d.f. = 2, 28, P < 0.10. Temporal subjects obtained shorter 
solution times with more structured sequence of presentation (i.e. they were faster in the HP 
condition than in CP and NSP, and faster in CP than in NSP). Spatial isomorph subjects, 
in contrast, were slower in the more structured conditions. 

Close inspection of the comments of three subjects revealed that they did not properly 
understand the experimental task. That is, they did not seem to understand the relations, 
like ‘priority’ and ‘sequence’, upon which the problem's functional requirements were 
defined. Since all three of these subjects belonged to the temporal isomorph group, this 
discovery raises questions about the effect of problem isomorph just cited. In order to 
clarify this matter, we performed further analyses of variance, discarding the data of these 
three subjects as ‘comprehension failures’. 

* We use the method of expected equal frequencies throughout. We have also checked these results using the 
method of proportional frequencies (Ferguson, 1971, pp. 239-241) and find no discrepancies. 

For the performance data (χ² = 0.63, d.f. = 5), the main effect of isomorph presentation 
persists, F = 9.35, d.f. = 1, 26, P < 0.005. There is no effect of sequence of presentation 
and no interaction. For solution time (χ² = 0.55, d.f. = 5), we also still obtain a main effect 
of the isomorph presentation factor, F = 9.70, d.f. = 1, 25, P < 0.005. There is, again, no 
main effect of sequence of presentation, but a nearly significant interaction term, F = 3.12, 
d.f. = 2, 25, P < 0.10. 

In summary, the results of Expt 1 seem to show a reliable difference between the two 
isomorphic versions of the design problem: the temporal isomorph is solved more slowly 
and less successfully than the spatial isomorph. On the other hand, our ‘sequence of 
presentation’ variable obtained only a marginal effect on solution time and no measured 
effect on performance. 

It is notable that all of the 17 subjects in the spatial isomorph group used a graphic 
representation (a layout drawing) of the business office in the course of solving the design 
problem. However, only two of the temporal isomorph subjects used such a representation. 
Nine others produced a discursive listing of facts pertaining to the final manufacturing 
schedule, and the remaining four (not counting comprehension failures) employed 
discursive paragraphs. The fact that a graphic representation for problem information and 
solutions is more readily available in the spatial isomorph condition may be causally 
related to the fact that subjects in the spatial condition have higher performance scores and 
shorter solution times. This interpretation is encouraged by Schwartz's (1971) finding that 
graphic representations can function as problem-solving aids. 

Alternatively, the isomorph difference we measured might reflect fundamental conceptual 
differences between time and space. This possibility is encouraged by the fact that all of the 
three ‘comprehension failures’ we recorded were from the temporal isomorph presentation. 
(If there were some bias in the materials, the temporal isomorph would have been expected 
to be more easily comprehended since it, and not the spatial isomorph, was the source 
pattern for both problem statements.) 


Experiment 2 


Experiment 2 attempts to further clarify the basis of the isomorph differences discovered in 
Expt 1; in particular, to characterize the role which the availability of a representation 
might play in creating this difference. In Expt 2, subjects were provided with and trained to 
use a graphic representation. If the performance and solution time isomorph differences of 
Expt 1 were due to the relative availability of a graphic representation, this manipulation 
should attenuate the differences. If, on the other hand, the isomorph differences were 
based in more fundamental conceptual differences between the two problem versions, the 
isomorph difference should persist. 


Method 


Design and materials. Experiment 2 has the same 3 x 2 factorial design as Expt 1. In fact, the 
materials for the two experiments were identical with one exception. In Expt 2, subjects were 
provided with a matrix representation in which they were to record their intermediate and final 
design solutions. An example of this matrix is given in Fig. 1. 

In the spatial isomorph problem statement, the horizontal dimension in the matrix was the office's 
position with respect to the accounting records and reception area (accounting records at the extreme 
right, reception at the left). The vertical dimension was defined as corridors containing offices; the 
very top row of the matrix was defined as bordering the central hallway (near to which higher 
prestige employees like to have their offices). Thus, within a column of the matrix, height represents 
prestige. In Fig. 1, A is closer to the accounting records than B and C. B and C are located on the 
same corridor, and C has greater prestige. 
Figure 1. Example of matrix representation provided to subjects in Expt 2. 


In the temporal isomorph, the horizontal dimension was a time line — columns to the right are 
‘earlier’ than columns to the left. Columns themselves represented ‘shifts’, and height in a column 
represented ‘priority’. Thus, in the example A is scheduled two shifts before C and B. C and B are 
scheduled for the same shift, and C has been assigned higher priority. 
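
The reading rules for the temporal version of the matrix can be sketched as a small data structure. Fig. 1 itself is not reproduced here, so the layout below is an assumed reconstruction from the description: A lies two columns to the right of the column that holds C (top) and B (below it). Comparing priority only within a column is also an assumption, and all names are illustrative.

```python
# Columns run left to right; within a column, entries run top to bottom.
# Assumed layout for the example described in the text.
EXAMPLE = [["C", "B"], [], ["A"]]


def locate(matrix, item):
    """Return (column index, depth from top) of an item in the matrix."""
    for col, cells in enumerate(matrix):
        if item in cells:
            return col, cells.index(item)
    raise KeyError(item)


def shifts_before(matrix, a, b):
    """Number of shifts by which a precedes b (negative if a is later).

    Columns to the right are earlier, so a larger column index is earlier.
    """
    return locate(matrix, a)[0] - locate(matrix, b)[0]


def same_shift(matrix, a, b):
    """True if a and b occupy the same column."""
    return locate(matrix, a)[0] == locate(matrix, b)[0]


def higher_priority(matrix, a, b):
    """True if a sits above b within the same column."""
    (col_a, depth_a), (col_b, depth_b) = locate(matrix, a), locate(matrix, b)
    return col_a == col_b and depth_a < depth_b


print(shifts_before(EXAMPLE, "A", "C"))    # 2
print(same_shift(EXAMPLE, "B", "C"))       # True
print(higher_priority(EXAMPLE, "C", "B"))  # True
```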

The initial two pages of the experimental booklet were modified to explain and illustrate the use of 
this representational scheme (adding another half page of material to the instructions). Each 
succeeding page of the booklet (intermediate solutions and final solutions) had a blank matrix printed 
out on it. Subjects were asked to record their design solutions in the matrix. 

In all other ways, the materials for the two experiments were identical. 


Procedure. The procedure for Expt 2 was identical to that of Expt 1 with the exception that the 
experimenter also explained the representation in the initial instruction period. A total of 45 subjects 
participated in the experiment. They were drawn from the local college student population, and were 
each paid for their participation. They were run in groups of 10 to 12 subjects each. Either 7 or 8 
subjects were randomly assigned to each of the 6 (sequence x isomorph) conditions: 23 were 
assigned to the temporal isomorph conditions, and 22 to the spatial isomorph conditions. The entire 
experimental session took about 2½ hours. 


Results and discussion 


Summary data for Expt 2, for both performance and solution time, are presented in Table 2. 


Table 2. Mean performance scores and solution times: Expt 2 

                               Spatial isomorph          Temporal isomorph 
                               Performance  Solution     Performance       Solution 
                               scores       times        scores            times 
Hierarchical presentation      13.14        65.14        11.25 (12.00)     62.25 (55.67) 
Clustered presentation         12.71        53.14        11.71 (12.00)     67.57 (64.17) 
Non-structured presentation    12.50        62.29        10.00 (11.00)     71.63 (69.33) 
Overall                        12.77        60.19        10.96 (11.67)     67.13 (63.06) 


We analysed the data from Expt 2 by the same ANOVA procedures used to analyse 
Expt 1. For the performance data (χ² = 0.20, d.f. = 5), a significant main effect of problem 
isomorph obtains, F = 5.13, d.f. = 1, 39, P < 0.05. There is no significant effect of sequence 
and no interaction. For solution time (χ² = 0.18, d.f. = 5), however, there are no significant 
treatment effects in the ANOVA. (One of the 45 subjects failed to signal the experimenter 
upon completion of the problem. As a result, one solution time is missing.) 
When we inspected the comments and intermediate solutions of our subjects, we again 
found cases of apparent comprehension failure. A total of five such cases were detected. 
Since all five of these cases belonged to the temporal isomorph groups, we performed 
further ANOVAs discarding the data of these subjects. For performance (χ² = 0.50, 
d.f. = 5), there are no significant main effects or interactions (for the effect of problem 
isomorph, F = 1.88, d.f. = 1, 34, P < 0.2). For solution time (χ² = 0.23, d.f. = 5), there are 
also no significant effects. 

In sum, when subjects are provided with a graphic representational scheme, differences in 
performance and solution time between the spatial and temporal problem isomorphs 
diminish (to less than 40 per cent of what they were in Expt 1). Accordingly, the relative 
availability of a graphic representation does emerge as an effective difference between the 
spatial and temporal isomorphs. We base these conclusions, of course, on an acceptance of 
the null hypothesis, something one generally wants to avoid. However, if an isomorph 
difference as strong as that we measured in Expt 1 had been present in Expt 2, a power 
analysis estimates that the probability of our finding it would be at least 0.97. 

Providing temporal subjects with a graphic representation does not collapse all isomorph 
differences. The tendency for temporal isomorph subjects to experience greater 
comprehension problems than the spatial isomorph subjects, noted in discussion of Expt 1, 
is still apparent. In fact, the tendency for temporal isomorph subjects to experience 
‘comprehension failure’ is significantly different from chance, P < 0.005, by sign test, 
pooling across both experiments. Five of 23 subjects in the temporal group in Expt 2 (and 3 
out of 18 in Expt 1) failed to adequately comprehend the problem, while none of the 22 
subjects in the spatial group fell into this category. 
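
The sign test on the pooled comprehension failures can be worked through as follows. This is a minimal sketch: the text does not say whether the test was one- or two-tailed, so the one-tailed form is an assumption, and the function name is illustrative. All eight failures (five in Expt 2 and three in Expt 1) fell in the temporal groups.

```python
from math import comb


def sign_test_one_tailed(successes, n):
    """P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


temporal_failures = 5 + 3  # Expt 2 plus Expt 1
spatial_failures = 0
n = temporal_failures + spatial_failures

p = sign_test_one_tailed(temporal_failures, n)
print(p)  # 0.00390625
```

With eight of eight cases on one side, the one-tailed probability is 1/256, which is below the reported 0.005 level; a two-tailed test would double it to about 0.0078.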


General discussion 


First, we will comment on the failure of the present study to detect an effect of sequence of 
‘presentation’, then we will discuss the effects of isomorph ‘presentation’ and 
‘representation’ that were detected. The sequence variable was intended to manipulate the 
degree to which the implicit goal structure of the design problems was made apparent to 
the subject-designer. In the HP condition, these goals were explicitly spelled out. In the CP 
condition, they were implicit in the clustering of the functional requirements vis-à-vis the 
pages of the experimental booklet. In the NSP condition, the goal structure was obscured. 

Since we find no reliable effect of sequence, one might want to conclude that the 
transparency of a design problem’s goal structure doesn’t matter — at least in the case of 
our design problems. Perhaps the design problem is too artificial and the subjects were just 
manipulating the A, B, C,...entities formally and ignoring the details of the cover story. 
However, neither the obtained differences between isomorphs, nor the comments subjects 
spontaneously rendered are consistent with such an ‘artificiality’ argument. Subjects often 
wrote notes on their solutions like ‘B is incompatible with everybody, I’ll have to put him 
off by himself’. These observations suggest that subjects were involved with the cover story, 
and indeed, that to some extent they approached the design problem as if it were a 
real-world problem. 

It is also possible that the particular presentation sequences we contrasted are largely 
ineffective, but that some other sequence manipulations might indeed control performance 
and solution time variables. For example, it might be the case that sequence of presentation 
variables are effective when they effectively structure a complex problem into 
‘subproblems’ (Thomas, 1974; Carroll et al., in press). Our three sequence conditions 
equally did not allow the overall design solution to be independently decomposed into 
subproblems. Perhaps significant effects would have been found with a sequence 
manipulation in which subjects could, for example, design the organization of shift number 
one based on the information they are presented with on page one of the booklet, shift 
number two based on information they receive on page two, etc. Each shift could be 
designed (almost) independently of any other shift. Such a sequence manipulation might 
obtain an effect on performance and solution time measures in contrast to a set of 
sequences which force subjects to reanalyse their entire design each time they received new 
information (i.e. like the sequence conditions in the present study). This possibility remains 
for further research. 

Since providing subjects with a representation largely mitigates the effect of the problem 
isomorph variable reported in Expt 1 vis-à-vis the spatial and temporal versions, it seems 
that the difference between the two isomorphs can be attributed, at least in part, to the 
relative availability, or accessibility, of representations. Thus, it is argued that a 
representational scheme is more available to subjects in the spatial condition, and they are 
therefore able to solve the problem faster and with greater success (Expt 1). However, when 
subjects in both conditions are provided with an equally powerful graphic representation, 
differences between the two isomorph conditions attenuate (Expt 2). 

Comprehension failures, however, are no less common in the temporal isomorph group 
of Expt 2 than they are in the temporal group of Expt 1. Hence, at least one difference 
between the spatial and temporal isomorphs is not neutralized by making a representation 
available. It appears that while the availability of a representation may differentiate the two 
isomorphs with respect to ‘problem solution’, it does not differentiate them with respect to 
‘problem understanding’. That is, having a graphic representation seems to make the 
problem easier to solve, but not easier to understand (but cf. Mayer, 1976). It is still 
relatively more difficult to understand the temporal problem (as indexed by comprehension 
failures) in both experiments. 

Some additional perspective on these matters may be provided by pooling data from the 
two experiments, and analysing ‘representation’ directly as a factor in an ANOVA. This is 
justified in that the only difference between the materials and procedure of the two 
experiments resides in whether a representation was provided to the subject (Expt 2) 
or not (Expt 1). A 2 × 2 ANOVA was performed for the performance and solution time 
data of Expts 1 and 2 pooled for the factors of representation (Expt 1 versus Expt 2) and 
isomorph presentation. 

For performance (χ² = 1.66, d.f. = 3), the representation factor obtains F = 7.11, 
d.f. = 1, 75, P < 0.01. This result shows that the Expt 2 subjects, who had a representation 
provided, attained higher performance scores. The problem isomorph factor obtains F = 17.46, 
d.f. = 1, 75, P < 0.001. This shows that subjects in the spatial isomorph conditions 
attained better performance scores overall. There is a non-significant interaction of 
isomorph and representation, F = 1.90, d.f. = 1, 75, P < 0.2. For solution time (χ² = 1.49, 
d.f. = 3), there are also main effects of representation, F = 43.45, d.f. = 1, 74, P < 0.001, 
and isomorph presentation, F = 5.31, d.f. = 1, 74, P < 0.05. These differences reflect the 
fact that subjects in Expt 1, who had no representation provided, obtained shorter solution 
times, as did subjects in the spatial isomorph conditions overall. There is no representation 
by isomorph interaction. 

As before, we have also computed the ANOVAs discarding comprehension failures. For 
performance (χ² = 1.62, d.f. = 3), there is a main effect of representation, F = 6.88, 
d.f. = 1, 67, P < 0.025, and of problem isomorph, F = 9.43, d.f. = 1, 67, P < 0.005. The 
interaction term is non-significant. For solution time (χ² = 1.20, d.f. = 3), the factor of 
representation obtains F = 29.90, d.f. = 1, 66, P < 0.001, and the factor of isomorph 
presentation obtains F = 2.92, d.f. = 1, 66, P < 0.10. The interaction term is 
non-significant. 

These analyses argue that availability of a representation does not exhaust the difference 
between spatial and temporal isomorphs. The factor of problem isomorph obtains 
significant main effects in the pooled analysis, and does not interact significantly with the 
factor of representation. Hence, we still cannot rule out the influence of what we referred to 
earlier as conceptual differences. 

The effects of the representation factor itself suggest some further hypotheses about the 
role of representation in problem solving. There is a highly significant tendency for subjects 
in Expt 2 to take more time in solving the design problem, independent of isomorph 
presentation. Further, there was a significant tendency for these subjects to obtain higher 
performance scores. 

Perhaps, the longer solution times in Expt 2 are merely due to the fact that these 
subjects had more to learn (i.e. they had to learn to use the representation). We tried to 
eliminate this possibility by not including the instruction period in the solution time. 
However, it could be that subjects did not realize their lack of understanding of the 
representational scheme until after the instruction period, and hence the time they spent 
learning the representation was actually included in their solution time. On this view, the 
advantage of the graphic representation lies perhaps in its providing to the subject a 
recording medium that helps maintain and integrate previous intermediate solutions (see 
Greeno, 1973). 

Another possibility, however, is that the improved performance of the representation 
subjects is more directly related to the longer amount of time they spend on the problem. 
In this view, the benefit of the representation resides in its encouraging the subject to work 
longer on the problem. Indeed, this could be just the sense in which a representation can 
function as a problem-solving aid. In any case, the present experiment does seem to 
indicate that graphic representations can act as aids in ill-structured temporal design 
problems — and thus is a preliminary empirical justification for the use of such aids in 
real-world temporal design task environments like computer programming (see van Tassel, 
1974, for discussion of 'flow-charting' and related aids). 


Summary 

The present investigation suggests that the efficacy of graphic representations in solving 
well-structured deduction problems (e.g. Schwartz, 1971) may generalize to ill-structured 
problem domains like design. Furthermore, it was suggested, certain presentations of 
problem information encourage graphic representation and are (thereby) rendered easier to 
solve (spatial versus temporal in Expt 1). 

This study also elaborates previous analyses of problem isomorphism, distinguishing, in 
particular, between spatial and temporal isomorphs. The spatial isomorph, in the present 
study, obtained better performance and faster solution times (Expt 1), and occasioned fewer 
comprehension failures than the temporal isomorph. This shows that the intuitive 
distinction between time and space, like the ‘transfer’ and ‘change’ comparison studied by 
Simon & Hayes (1976), can be an effective variable in presenting problem information. We 
have also attempted to clarify the basis of isomorph differences, suggesting that these 
differences reside in the extent to which a problem statement makes a useful 
representational scheme available, or accessible, to the problem-solver (see Simon & Hayes, 
1976). 

Finally, we have developed a paradigm which allows for objective assessments of 
problem-solving behaviour in relatively ill-defined problem-solving environments. 


Appendix 


1. Functional requirements for temporal isomorph 


A uses the same resources as B, 
F uses different resources than A, 
B uses the same resources as G, 
C uses different resources than D, 
E uses the same resources as B, 
G uses different resources than F, 


The total number of shifts required should be as small as possible, 


B should precede stage D, 
E should follow stage G, 
A should follow stage C, 
F should precede stage E, 
D should follow stage G, 
C should precede stage F, 


F is a higher priority manufacturing stage than B, 
C is a lower priority manufacturing stage than B, 
G is a lower priority manufacturing stage than C, 
D is a higher priority manufacturing stage than F, 
E is a higher priority manufacturing stage than G, 
A is a lower priority manufacturing stage than D. 
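
The ordering requirements listed above can be checked mechanically. The following is a minimal sketch and not part of the original study: it encodes only the six precedence requirements (the resource and priority requirements are left out), and it assumes that a candidate solution is a strict linear ordering of the seven stages. The function name `violated_precedences` and the two example orderings are hypothetical, chosen only for illustration.

```python
# Precedence requirements from the temporal isomorph, as (earlier, later) pairs.
PRECEDENCES = [
    ("B", "D"),  # B should precede stage D
    ("G", "E"),  # E should follow stage G
    ("C", "A"),  # A should follow stage C
    ("F", "E"),  # F should precede stage E
    ("G", "D"),  # D should follow stage G
    ("C", "F"),  # C should precede stage F
]


def violated_precedences(order):
    """Return the (earlier, later) pairs that a candidate stage order breaks."""
    position = {stage: i for i, stage in enumerate(order)}
    return [(a, b) for a, b in PRECEDENCES if position[a] >= position[b]]


# Hypothetical candidate orderings, for illustration only.
good = ["C", "B", "G", "F", "A", "D", "E"]
bad = ["A", "B", "C", "D", "E", "F", "G"]

print(violated_precedences(good))
print(violated_precedences(bad))
```

An empty list means that the ordering satisfies every precedence requirement; it says nothing about the number of shifts or the priority requirements, which would need to be scored separately.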


2. Functional requirements for spatial isomorph 


A is incompatible with B, 
F is compatible with A, 
B is incompatible with G, 
C is compatible with D, 
E is incompatible with B, 
G is compatible with F, 


The total number of corridors office space is rented on should be as small as possible, 


B uses the accounting records more than does D, 
E meets people in the reception area more than does G, 
A meets people in the reception area more than does C, 
F uses accounting records more than does E, 
D meets people in the reception area more than does G, 
C uses the accounting records more than does F, 


F has more prestige than B, 
C has less prestige than B, 
G has less prestige than C, 
D has more prestige than F, 
E has more prestige than G, 
A has less prestige than D. 


Reaction time as a function of the cardiac cycle 


V. T. Wynn 





Simple reaction times to auditory stimuli varied with the phase of the cardiac cycle in which the 
stimulus was presented. Longitudinal studies showed that stimuli triggered at different phases in the 
ECG gave rise to different rhythmic patterns of behaviour. 





Conflicting results have been published on studies of reaction time as a function of the 
electrocardiogram (ECG) in human subjects. Some workers (Birren et al., 1963; Callaway 
& Layne, 1964) have found variations in reaction times which they have related to central 
nervous system changes caused by the effect of arterial pressure fluctuations on the 
baroreceptors of the arterial system. Others (Bonstock & Jarvis, 1970; Botwinick & 
Thompson, 1971; Elliot & Graf, 1972) have not been able to confirm these results. Earlier 
work has shown (Wynn, 1973) that simple reaction-time responses vary day by day in a 
cyclical manner for both male and female subjects. A correlation was observed between the 
daily mean of the reaction-time responses of the female subject and her menstrual cycle. 
The reaction times speeded up during two phases of the menstrual cycle — just before 
menstruation and again at the time of ovulation, and slowed down between the two phases. 
Other studies on the phenomenon of ‘absolute’ pitch (Wynn, 1972), the inbuilt ability 
that some people have to estimate the pitch of a note to within an accuracy of a few hertz, 
have shown that the estimation of pitch also varies day by day in male and female subjects. 


Method 


In an attempt to reduce the large moment-to-moment variations in reaction times an approximately 
constant foreperiod procedure was chosen, and any perturbing effect of the ECG was minimized by 
arranging for the reaction-time stimulus to be triggered at a particular point in the subject's own 
ECG. The foreperiod was governed by the arrival of the next reference point in the ECG to occur 
after a fixed dead time of 5 s. The constant 5 s delay was timed electronically from the preceding 
stimulus. By this means the naturally occurring variation in the heart rhythm introduced a little 
uncertainty in the foreperiod. The study was carried out in three parts. In the first experiment the 
onset of the R wave was singled out from the ECG to act as a triggering point and this was 
compared with the case when the stimulus was independent of the cardiac cycle. In the second 
experiment the T wave and the onset of the P wave were also used as triggering points for the 
stimulus. 
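
The triggering rule described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the author's electronic circuit: the function name `stimulus_times` and the list of R-wave onsets (a perfectly regular heart beating every 0.8 s) are hypothetical, and only the 5 s dead time is taken from the text.

```python
from bisect import bisect_left

DEAD_TIME_S = 5.0  # fixed dead time stated in the text


def stimulus_times(reference_times, first_stimulus, n_stimuli):
    """Fire each stimulus at the first ECG reference point that occurs
    at least DEAD_TIME_S after the preceding stimulus.

    reference_times must be sorted in ascending order (seconds).
    """
    times = [first_stimulus]
    while len(times) < n_stimuli:
        earliest = times[-1] + DEAD_TIME_S
        i = bisect_left(reference_times, earliest)
        if i == len(reference_times):
            break  # ran out of recorded reference points
        times.append(reference_times[i])
    return times


# Hypothetical R-wave onsets: one beat every 0.8 s.
r_onsets = [round(0.8 * k, 1) for k in range(40)]
stims = stimulus_times(r_onsets, first_stimulus=0.0, n_stimuli=4)
foreperiods = [round(b - a, 1) for a, b in zip(stims, stims[1:])]
print(stims, foreperiods)
```

With a perfectly regular rhythm the foreperiod comes out constant; replacing `r_onsets` with irregular beat times shows how natural variation in the heart rhythm adds a small uncertainty on top of the 5 s dead time.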

The onset of the R wave was obtained by using a Schmitt trigger circuit which was set to fire at a 
voltage level just above that of the height of the P and T waves. The onset of the P wave was 
extracted by using an additional Schmitt circuit which was set to fire at a voltage level just above that 
of the electrical noise in the ECG. This latter circuit produced three output pulses, one for each of the 
P, R and T waves. These were fed into a monostable multivibrator network and a voltage pulse was 
produced with a width which was long enough to cover the time interval between the onset of the P 
wave and the end of the following T wave. With this pulse width the multivibrator latched on to the 
onset of the P wave and remained inert during the following R and T waves (see Fig. 1). 
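
The latching behaviour of the monostable can be illustrated at the level of pulse times. The sketch below is an assumption-laden model, not a description of the actual hardware: it treats the multivibrator as non-retriggerable, and the wave timings (R 0.16 s and T 0.40 s after the P onset, one beat every 0.8 s) and the 0.6 s pulse width are hypothetical values chosen for illustration.

```python
def monostable_filter(pulse_times, width):
    """Non-retriggerable monostable: a pulse starts a window of the given
    width, and any further pulses arriving inside that window are ignored."""
    accepted = []
    busy_until = float("-inf")
    for t in sorted(pulse_times):
        if t >= busy_until:
            accepted.append(t)
            busy_until = t + width
    return accepted


# Hypothetical beat starts (P onsets) in seconds.
beats = [0.0, 0.8, 1.6]
pulses = []
for b in beats:
    # The low-threshold Schmitt circuit fires once each on the P, R and T waves.
    pulses += [b, b + 0.16, b + 0.40]

# The width must cover the P-onset-to-T-end interval but be shorter than one beat.
p_onsets = monostable_filter(pulses, width=0.6)
print(p_onsets)
```

Only the first pulse of each beat survives, which is the sense in which the multivibrator latches on to the P wave, provided the first pulse it ever receives is a P pulse.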

The term onset in this paper is used to describe the rising part of the wave which is large enough to 
trigger the Schmitt circuit. It is of necessity later in time than the true onset. This is particularly so in 
the case of the P wave where the true onset can be small and lost in the background noise. 

In the R case, although that Schmitt trigger is set at a higher level, the rapid rise of the wave 
delays the Schmitt onset by no more than 2 ms from the base-line cross-over point. 

The triggering point from the P wave was obtained from the output of the monostable 
multivibrator and that from the R wave was produced by the output of the first Schmitt trigger. For 
the third reference point, the output from the R wave Schmitt trigger was delayed electronically to 
provide a pulse which, when monitored on an oscilloscope, was observed to coincide with the middle 
of the T wave. 


Figure 1. Separation of the onsets of the P and R waves. 

A previous study (Wynn, 1974) of the various time constants in the ECG as a function of the time 
of day indicated that the time interval between the onset of the P and R waves was most constant 
between 2 p.m. and 6 p.m. The maximum variation between individual P-R intervals, measured with the 
same triggers as in this study, was found to be no greater than 0.01 s and 
usually less than 0.005 s. The means of groups of 100 intervals also differed by 
no more than 0.007 s in a 4-hour period. The width of the R wave and the interval between the peaks 
of the R and T waves were analysed by a digital computer and found to vary by no more than 0.002 
s and 0.01 s respectively. These measurements indicate that with respect to the onset of the P and R 
waves, fluctuations in the timing of a delayed stimulus during the P-T interval (due to ECG 
variations) would be less than 0.01 s during each experimental session. Timing errors in the remaining 
T-P interval are more uncertain due to the larger fluctuations of this interval, and are estimated to be 
less than 0.1 s. 

The selected pulses were fed into a stroboscope and reaction times were measured using the 
auditory response of the stroboscope as the stimulus. The stroboscope flash was removed by placing 
the stroboscope in a light-proof box. The experimental procedure required the author to react as 
quickly as possible to the auditory click stimulus by pressing a microswitch with the finger. 


Experiment 1 


The original aim of this experiment was to investigate the possibility that the standard 
error in the mean for a large number of simple reaction responses might be reduced by 
triggering the stimulus at one particular point in the ECG. To this end two different sets of 
reaction-time measurements were taken. In the first set the stimulus was independent of the 
cardiac cycle but in the second set the onset of the R wave was used to trigger the stimulus. 
In both cases there was an interval between responses of approximately 5 s. In the first set 
the interval was constant between the initiation of the response and the firing of the 
following stimulus. In the second case the variability in the heart rate introduced a little 
uncertainty in the time interval. 

The daily mean of more than 300 responses (in some cases more than 400) was 
calculated for each set. Readings were taken in two groups (one group for each set) at two 
to three regular times through the day, and the longitudinal study was continued for 7 
weeks. 


Results. The original aim of the experiment was set aside when the results of the study were 
analysed. When the daily means were plotted as a function of time (in days) a striking 
difference was observed between the two sets of readings. The reaction-time responses that 
were unrelated to the ECG showed the cyclic behaviour that had already been noted by the 
author (Wynn, 1973). The mean responses varied with a periodicity of approximately 19 
days, and superimposed upon this cycle was some evidence of the shorter (approximately 
weekly) cycle that had also been found earlier. 


Figure 2. Autocorrelation coefficient as a function of lag in days for the daily mean of 300 reaction times 
for: (a) stimuli presented at random through the ECG; (b) stimuli triggered by the onset of the R 
wave. 

Figure 2(a) is a correlogram plot. In this plot, correlation coefficients have been 
computed between the original time series and the same time series when the latter has 
been delayed by a variable amount (lag time). If the time series varies periodically then the 
autocorrelation coefficients also vary between ±1 with the same periodicity. The longer 
cycle is clearly seen in this plot of autocorrelation coefficient against lag (in days). The 
shorter rhythm, although hinted at in the raw data, has not enough statistical significance 
to make itself known in the presence of the longer cycle in Fig. 2(a). The earlier work 
(Wynn, 1973) where the shorter rhythm was clearly observed covered a much longer period 
of time. 
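
The correlogram computation described above can be sketched as follows. This is an illustrative implementation, not the author's program: the function name `autocorrelation` is hypothetical, and the daily means are synthetic (a pure 19-day sinusoid around 0.200 s) rather than the recorded data.

```python
import math


def autocorrelation(series, lag):
    """Pearson correlation between the series and itself delayed by `lag`."""
    x = series[: len(series) - lag]
    y = series[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic daily means (seconds): a pure 19-day rhythm, for illustration only.
PERIOD_DAYS = 19
daily_means = [0.200 + 0.005 * math.sin(2 * math.pi * d / PERIOD_DAYS)
               for d in range(57)]

# Autocorrelation coefficient for each lag from 0 to 29 days.
correlogram = [autocorrelation(daily_means, lag) for lag in range(30)]
print([round(c, 2) for c in correlogram])
```

For a strictly periodic series the coefficient returns to nearly +1 at a lag of one full period and approaches -1 near half a period; with noisy data the swing is smaller, which is why a weaker superimposed rhythm can fail to show up.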

The raw data for the R triggered response times produced a much smoother curve, i.e. 
the longer rhythm was more evident still and there were no indications of the presence of 
the shorter rhythm. Figure 2(b) is the correlogram plot for the latter set of results and it 
can be seen that the swing between the maximum and minimum is greater in this figure 
than in 2(a). 

The apparent disappearance of the shorter rhythm prompted the author to take an extra 
set of readings per day for stimuli triggered by another different part of the ECG. The 
onset of the P wave was chosen for convenience. Although these last measurements were 
continued for only two weeks, it seemed evident that the plot of their daily means as a 
function of time (in days) bore little resemblance to the plots from the other two sets. 
However, the newest set of readings indicated the presence of the shorter rhythm but this 
time the cycle appeared to be 180° out of phase with the shorter rhythm that was noted in 
the first set. Unfortunately, not enough readings were taken in this last set to produce any 
conclusive statistics and it was decided to continue the investigation after a suitable 
recovery period had elapsed. 

Correlation coefficients were computed for the two longer sets. They were positively 
correlated at much better than the 0.01 level of significance (0.67 for 41 pairs of readings). 


Conclusions. The results clearly indicated the presence of an approximately 19-day rhythm 
and hinted at the possibility that the heart might play a fundamental role in perturbing the 
reaction-time responses. It seemed possible that the two cycles previously observed by the 
author could be separated into their two components by triggering the stimuli at different 
points in the ECG. 


Experiment 2 


Two months after the first experiment the author felt sufficiently recovered to attempt a 
further longitudinal study. In this study, which lasted for 6 weeks, simple reaction-time 
responses were made in a similar manner to that described above except that on most days 
the responses were made in one !+hour session. Three sets of readings were taken, one for 
responses triggered by the onset of the R wave, another by the onset of the P wave, and the 
last for responses triggered in the middle of the T wave. It was not possible to continue the 
responses to stimuli unrelated to the ECG. Two hundred and forty responses were made 
per day to each of the trigger points. 


Results. Figure 3(a) is the correlogram for the R triggered responses. This time the 
periodicity was approximately 16 days. Figure 3(b) shows the correlograms for the P and T 
triggers and indicates the presence of an 8-day cycle. 

Cross-correlation coefficients were computed for the different sets and significant 
correlations were observed only between the P and T triggered sets. These were negatively 
correlated at much better than the 0.01 level of significance (−0.52 for 39 pairs of 
readings). 
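
The significance claim can be checked with the usual t test for a correlation coefficient. The text does not say which test the author used, so the following is a sketch under that assumption; the function name `t_from_r` is hypothetical, and the critical value quoted in the comment is the approximate two-tailed 0.01 value from standard t tables.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# P versus T triggered sets: r = -0.52 for 39 pairs of readings.
t = t_from_r(-0.52, 39)
print(round(t, 2))

# Approximate two-tailed critical value at the 0.01 level for 37 degrees
# of freedom, taken from standard tables.
CRITICAL_T_01 = 2.72
print(abs(t) > CRITICAL_T_01)
```

On this reckoning the magnitude of t is comfortably above the critical value, which agrees with the statement that the correlation is significant beyond the 0.01 level.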


Conclusion. The R triggered results produced a similar rhythmic fluctuation to that in the 
first experiment, although of a slightly shorter period (16 days). The P and T results clearly 
picked up the shorter 8-day rhythm with no indication of the presence of the longer cycle. 
These last two sets also showed one very interesting result, namely that although they underwent 
similar fluctuations in amplitude and periodicity the cycles were in opposite phase to each 
other. 


Figure 4. Mean reaction time as a function of the cardiac cycle compiled from four different sessions. 
Each session is represented by a different symbol. Each vertical error bar represents the standard 
error in the mean. The horizontal error bars are estimates of the accuracy of the stimulus triggering 
times with respect to the ECG. The horizontal lines represent those parts of the cycle in which the 
mean reaction time varied in a cyclical manner as shown in Figs 2 and 3. 


Experiment 3 


After noting that the mean values of the reaction times varied day by day in a rhythmic 
manner, the period and phase of which were observed to depend on the position of the 
stimulus in the cardiac cycle, a further study was attempted to find out the extent of the 
heart's control over each rhythm. In this last study the stimulus was triggered at different 
points throughout the cardiac cycle. The stimulus was arranged to fire at the different times 
during the P-R interval by feeding the P voltage pulse into a variable time delay circuit. 
The R pulse was used for the remaining R-P interval. Stimuli were presented through the 
ECG in steps varying in magnitude from 0.02 s during the P-T interval, to 0.1 s during the 
T-P interval. 

It was considered pointless to divide this latter region into smaller steps in view of the 
large variations that occur in the T-P interval. Reaction times were read to 0.001 s and 
enough measurements were made to reduce the standard error in the mean to 
approximately 0.001 s (250 readings). 
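
The relation between the number of readings and the standard error can be made explicit. The sketch below uses the standard formula (standard error = standard deviation divided by the square root of n); the function names are hypothetical, and the implied spread of individual reaction times is an inference from the two figures quoted in the text, not a value reported by the author.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def implied_sd(target_se, n):
    """Spread of single reaction times implied by a given standard error."""
    return target_se * math.sqrt(n)


# A standard error of about 0.001 s from 250 readings implies that single
# reaction times had a standard deviation of roughly this many seconds.
print(round(implied_sd(0.001, 250), 4))

# Hypothetical batch of five reaction times (seconds), for illustration only.
print(round(standard_error([0.18, 0.19, 0.20, 0.21, 0.22]), 4))
```

Because the standard error falls only with the square root of n, halving it again would have required about four times as many readings per trigger point.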

The investigation over the ECG was carried out in several sessions and a total of more 
than 35000 reaction times was recorded. Each session lasted about 4 hours and covered one 
part of the ECG. In each session the stimuli were presented at particular points in the ECG 
section in batches of 50 for each point. These batches were distributed randomly in time 
through the section in order to minimize any effect that fatigue might have on the subject's 
concentration. When the chosen section had been covered, the process was repeated until a 
total of 250 reaction times had been recorded for each point. 


Results. Figure 4 shows a plot of the mean reaction time as a function of the ECG for four 
sessions in which the reaction times to stimuli presented at the onset of the P and R waves 
reached their maxima and minima respectively. The results from a further nine sessions also 
produced fluctuations during the cardiac cycle of similar magnitudes, but the maxima and 
minima were not in the same places. These results were, however, consistent with the 
presence of three fixed intervals in the cardiac cycle in which the reaction times varied in 
the way indicated in Figs 2 and 3. These active sites are represented approximately by the 
horizontal lines under the plots in Fig. 4. The vertical error bars are the standard errors in 
the mean for the responses to each trigger point. The horizontal bars are estimates of the 
uncertainties in the positioning of the trigger pulses relative to the P and R waves shown in 
the ECG trace. 

Stimuli triggered during the times outlined by the outer lines would produce mean 
reaction-time responses which vary rhythmically with the shorter period. Those triggered 
by the active site under the R wave would produce the longer rhythm. 


Discussion 

The first experiment shows that if enough simple reaction-time responses are made by a 
well-motivated subject then the mean reaction time can be accurately determined (standard 
error in the mean of better than 1 per cent). It also shows that the reaction times vary from 
day to day in a rhythmical manner. This behaviour is consistent with earlier work (Wynn, 
1973), some of which was obtained with subjects who were not aware of the results until 
months after the experiment. The experiment also hints at the possibility that the action of 
the heart can in some way perturb the reaction-time responses in a selective manner, that 
is, stimuli triggered at different points in the heart cycle can produce different rhythmic 
perturbations in the response times. The second experiment shows this quite clearly and, 
although the periodicity of the longer cycle has changed from 19 to 16 days, this is 
understandable when one considers the large variations that are commonplace in the 
menstrual cycle. Earlier work by the author on ‘absolute’ pitch suggests that there may 
indeed be a close link between the fluctuations in the daily means of reaction-time 
responses and the hormonal system. This experiment indicates that responses to stimuli 
triggered near the onset of the P wave are perturbed by the heart in a different manner to 
those triggered by the onset of the R wave. The different cycles in the two cases suggest 
two independent mechanisms. The out-of-phase nature of the 8-day T wave cycle, however, 
suggests a connexion with the P wave rhythm. 

An interesting question arises as to the mechanisms involved in these cyclical 
perturbations. Without a doubt the cardiovascular system must play the major role, and it 
seems most plausible that cyclical variations in the impulses from the cardiac feedback 
mechanisms might interfere with the reaction-time responses. Lacey & Lacey (1974) have 
proposed that reaction times are affected by heart rate via impulses from the baroreceptors 
of the carotid sinus, aortic arch, etc. The reticular activating system is suggested as the site 
of interference between the baroreceptor impulses and the motor or sensory nerves involved 
in the reaction-time responses. 

The third experiment was set up to map out, if possible, the times in the cardiac cycle 
when these cyclical effects might be observed. The task was arduous and required a great 
deal of concentration over a series of 4-hour periods, and the results must of necessity be 
crude and open to the criticism of subjectivity. The only answer to such criticisms would be 
to emphasize the small variations that did exist. The author does not feel that such 
variations could be consciously or subconsciously controlled by the subject, bearing in 
mind also that no results were analysed until after the end of the 4-hour period. Ideally the 
author would have liked to repeat this last experiment, step by step, through the two cycles 
but this was just not possible. Three of the 4-hour batches were performed on consecutive 
days but that was all that the author's person could tolerate. 

The approximate timing for the maxima and minima results was obtained by trial and 
error. A trial batch of responses was made for each of the three trigger times (onsets of P 
and R, and the middle of the T waves) in the morning. If the mean responses suggested 
that the P and T rhythms were at their antinodes, then the 4-hour sessions were carried out 
in the afternoon. A few more trial batches were performed on subsequent days to check the 
timing of the batch with the cycles. 

It is interesting to note that the horizontal lines under the P and T waves are of different 
lengths and are the sites where the opposing cycles occur. The difference in these lines 
would tie in with the fluctuating reaction-time patterns that were hinted at in the first 
experiment. There the stimulus, which was unrelated to the cardiac cycle, produced the 
shorter cycle which was of opposite phase to that produced by the P triggered stimuli. If it 
is assumed that in the former case the stimuli would occur randomly through the ECG 
then one could expect the effect of stimuli occurring in the T active site to swamp out those 
occurring in the P region. 

The horizontal lines in Fig. 4 do suggest one possible way in which the cardiovascular 
system could influence the reaction-time responses. These areas of rhythmic activity occur 
at about the same times as the firing of impulses from the atrial A and B receptors and the 
ventricular pressure receptors. It is well known that the A receptors have a prominent burst 
of impulses which occur in the P-R interval and that the B-type receptors discharge in 
mid-systole. The ventricular receptors are also known to discharge 0.02-0.05 s after the 
QRS complex. 

The active regions suffered no noticeable displacement when the stimulus intensity was 
altered to produce mean reaction times which were 0.06 s slower. This last result suggests 
that if these baroreceptors are responsible then they affect only the sensory side of the 
reaction-time process. This is in agreement with other work (Saari & Pappas, 1976). 

Although the baroreceptors may affect the reaction-time responses, they do not explain 
why the perturbations should be rhythmical. This rhythmicity could be due to cyclical 
fluctuations in any or several of the properties of the cardiovascular system, blood pressure, 
volume, muscle tone, etc., and the activity of the different baroreceptors may be following 
the changes in different ways. For example, it may be that the P and T rhythms are 
responding to changes in blood volume. It is known (Keele & Neil, 1971) that the atrial B 
receptors do respond to changes in blood volume. The R rhythm may be responding to 
blood pressure changes. There are however many different ways in which one could 
envisage methods of introducing rhythmic fluctuations into the activity of the baroreceptors 
and much more work would be required before a satisfactory answer could be given. All 
the early work by the author does suggest that at least the longer cycle is related to sex 
hormone fluctuations, and it is conceivable that the two cycles might be related to hormone 
changes connected with periodic variations in the gonadotrophins. 


Conclusion 


Although the presence of two cycles with different periods is indicated by the experiments, 
it has not been possible to do other than conjecture on the mechanisms producing the 
rhythmic behaviour. It appears that both the hormone and the cardiovascular systems are 
implicated in the perturbation process, and it seems that a model based essentially on that 
proposed by Lacey & Lacey is appropriate, although such a model differs from theirs in that 
it is not the heart rate but some other parameters that influence the reaction times via the 
reticular activating system. This is, however, conjecture and requires a much deeper 
investigation. The author does not feel that the simple reaction-time technique is suited to 
such an investigation. A much less arduous technique must be found for monitoring the 
cycles and he feels that this might be accomplished by an investigation of the ECG itself. 
Preliminary results (Wynn, 1974) do suggest this to be a fruitful field. Similar rhythmic 
fluctuations to those mentioned above have been observed in the ECG in both male and 
female, and it seems possible that if the technique could be perfected it would be very 
suitable for investigating correlations between the cycles and arterial pressure and 
hormonal changes, etc. It might indeed be the monitoring technique psychologists need to 
explore the menstrual cycle. 


Liking words as a function of the experienced frequency of their occurrence 
W. Sluckin, A. M. Colman and D. J. Hargreaves 








A hypothetical inverted-U curve is postulated linking liking of stimuli to familiarity with them. An 
experiment using a special procedure was carried out in which the relationship between favourability 
and familiarity was investigated for words ranging from very unfamiliar to very familiar. The 
results conformed to the theoretical curve. This indicated that the positive correlation between the 
variables reported by several researchers (e.g. Zajonc) and the negative correlation found by others 
(e.g. Cantor) should be regarded as complementary rather than contradictory. 





Aesthetic judgements have long been thought to depend, among other things, on stimulus 
intensity. This relationship is depicted by the well-known Wundt curve. The curve, as given 
by Wundt and also as presented later by Berlyne (1971), is set out in Fig. 1. The hedonic 
value of a stimulus is regarded by Berlyne as a function, rising to a peak and then falling, 
of the person's arousal; and arousal is considered to be directly related to the novelty of 
the stimulus. 


Figure 1. The Wundt/Berlyne curve. Vertical axis: pleasantness (Wundt) or hedonic value (Berlyne). 
Horizontal axis: stimulus intensity (Wundt); arousal or novelty (Berlyne). 


Novelty in Fig. 1 starts at nought, and this presents a conceptual problem. Zero novelty 
implies that the person is totally familiar with the stimulus. However, the view may be 
taken that such complete familiarity is never, strictly speaking, achieved. Familiarity may 
be thought of as increasing ad infinitum with continued exposure to the stimulus. Complete 
unfamiliarity, on the other hand, clearly occurs when exposure to the stimulus is nil, i.e. 
when the stimulus is entirely strange to the person. 

The difficulty of conceiving of novelty as starting at zero in the Berlyne curve which 
relates hedonic value to novelty has prompted us to propose a function presented in Fig. 2. 
In this curve the axis of abscissae is the reverse of that in Fig. 1, that is high novelty (low 
familiarity) is now on the left and low novelty (high familiarity) is on the right-hand side of 
the figure. A consequence of this reversal is that at zero familiarity hedonic value (labelled 
favourability by Zajonc, 1968) is negative. This makes intuitive sense in that a strange 
stimulus may well be initially disliked by a person, rather than merely regarded as of 
neutral favourability. It should further be noted that familiarity is directly related to time. 
Thus, the curve in Fig. 2 assumes the form of a time function, linking in an inverted-U 
fashion favourability (or liking the stimulus) to the duration of the person's exposure to the 
stimulus. 


Figure 2. The hypothesized curve linking favourability to familiarity/time. Vertical axis: 
favourability; horizontal axis: familiarity/time. 


This model relationship is in keeping with everyday experience, as when liking for a new 
tune or poem gradually increases with time and then slowly declines. Two sets of 
experimental findings, however, those stemming from the work of Cantor (e.g. Cantor, 
1968; Cantor & Kubose, 1969) and Zajonc (e.g. Zajonc, 1968; Zajonc & Rajecki, 1969) 
appear not to fit the inverted-U curve (Hutt, 1975). The Cantor-type results indicate that 
familiarization with stimuli reduces liking for them; the Zajonc-type results show that the 
more familiar the stimuli the better they are liked. 

It has been said that Zajonc-type results occur in situations in which the stimuli with 
which the subject is familiarized are complex in relation to the subject's prior general 
experience (Berlyne, 1970; Faw & Pien, 1971). Such stimuli are preferred to similar but 
totally strange stimuli. In studies of this kind the relationship between familiarity and 
favourability is positive and approximately linear. When familiar stimuli are simple in 
character, as in the Cantor-type studies, favourability is thought to decrease with increased 
familiarity in a roughly linear manner (see review of ‘two-factor’ theories by Harrison, 
1977). Thus the varying findings may be only seemingly conflicting; they could be the result 
of differing experimental conditions. It has been suggested that some cases fit the ascending 
part of the inverted-U curve in Fig. 2, some cases fit the descending part, and yet others, in 
which liking was found to be independent of familiarity, fit the top, approximately flat, part 
of the curve (e.g. Crandall et al., 1973; Stang, 1974). However, a common feature of 
well-nigh all the previous studies is the relatively short range over which the familiarity 
variable has extended. This may well have been responsible for the approximately 
straight-line functions found to link favourability to familiarity, either rising, or flat, or 
falling. 
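
The restricted-range argument can be illustrated numerically. The sketch below is purely hypothetical: no equation for the curve is given in the text, so a quadratic is assumed here simply because it is the simplest inverted-U that is negative at zero familiarity. The coefficients, the familiarity scale of 0 to 10 and the function names are all illustrative assumptions.

```python
def favourability(familiarity):
    """Hypothetical inverted-U: negative at zero familiarity, peak at 5."""
    return -1.0 + 0.8 * familiarity - 0.08 * familiarity ** 2


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def window_slope(lo, hi, steps=11):
    """Fit a straight line to the curve sampled over a narrow familiarity range."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return slope(xs, [favourability(x) for x in xs])


# Three narrow ranges: low, intermediate and high familiarity.
for lo, hi in [(0, 2), (4, 6), (8, 10)]:
    print((lo, hi), round(window_slope(lo, hi), 2))
```

Sampling only a narrow band of familiarity gives a rising, flat or falling straight-line fit depending on where the band sits, which is the pattern of seemingly conflicting results described above.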

To investigate the effect on favourability of familiarity ranging widely from very low to 
very high, special experimental procedures have to be employed. In fact, in more recent 
times, some studies have attained this aim by utilizing the subjects’ naturally acquired 
familiarity with common stimuli such as letters and words. Thus, Sluckin et al. (1973), 
using letters and letter-like shapes as stimuli and children as subjects, found that 
‘favourability is a function of exposure, but that additional exposure does not necessarily 
increase favourability and may even reduce it’ (p. 563). Colman et al. (1975), using words 
and word-like syllables as stimuli and children and young adults as subjects, found ‘an 
inverted-U function relating familiarity and liking’ (p. 481). 


Design and methodology 


A few words need to be said about the design and methodology of the present experiment, 
since it differs in certain important respects from most previous research in this area. The 
first somewhat unusual feature is the between-subjects design, used previously by Harrison 
(1969) and Moreland & Zajonc (1977), rather than the much more common within-subjects 
design. In our experiment subjects were randomly assigned to conditions in which they 
were called upon to rate either their familiarity with or their liking for the chosen words. 
One of the advantages of this design feature is that the results are unaffected by any 
hypotheses or expectations on the part of the subjects concerning the relationship between 
familiarity and liking, since none of the subjects knows that these are the two variables 
under investigation. A potential source of artifact in the results, which is present in all 
within-subjects designs, is excluded. 

Another feature of the design sets it apart from most previous work in this area, namely 
the use of subjective measures of both familiarity and liking. Harrison (1969) has used 
ratings of familiarity with persons (public figures) but not with ordinary words. Most 
previous studies have used subjective measures of liking but have manipulated the 
familiarity of the stimuli by varying the number of exposures the subjects have to them. In 
the present experiment, the number of previous exposures varies from zero to literally 
millions but is not known in specific cases. The subjects were requested simply to rate 
familiarity in an analogous fashion to their ratings of liking. Moreland & Zajonc (1977) 
have reported an association between liking on the one hand and both subjective and 
objective familiarity on the other. The reasons for our use of a subjective measure of 
familiarity are (a) the comparatively large variance in familiarity which this enabled us to 
investigate; (b) the fact that objective indices of the familiarity of words (e.g. word counts) 
are not only inevitably obsolescent and culturally biased but also give at best a rough 
approximation to the familiarity of the subjects in a specific experiment with the words 
chosen; (c) that such objective measures are based in any event on averages, whereas the 
subjective procedure enabled us to measure directly the familiarity of each subject with 
each word separately; and (d) that subjective measures have been found to be better 
predictors of favourability than any objective ones (Harrison, 1977). 

The final and possibly most significant design feature is the use of naturally occurring 
stimuli of varying degrees of familiarity rather than stimuli whose familiarity has been 
artificially manipulated in the course of the experiment. Thus, following Sluckin et al. 
(1973) and Colman et al. (1975), stimuli are chosen with which the subjects are more or less 
familiar; in the present case they are words. In most previous work in this area, the stimuli 
are initially novel and an attempt is made to manipulate their familiarity by repeated 
exposure. The methodology used in the present experiment, however, allows a much wider 
range of the familiarity continuum, from complete unfamiliarity in the case of obscure 
words to extremely high levels in the case of common words, to be investigated. 


Method 

Subjects and procedure 

The subjects were 33 young adults (18 females and 15 males) whose ages ranged from 19 to 43, with a 
mean of 23.3 years. Seventeen subjects were randomly assigned to the Familiarity condition and 16 to 
the Favourability condition. 

The quasi-random method of selecting the stimulus words was as follows. From every 10th page of 
the Pocket Oxford Dictionary (rev. ed.) a one-syllable word was selected at random. If the word 
turned out to denote an object or idea of an obviously emotive kind, which occurred very rarely, it 
was rejected and another one-syllable word was selected. Further, words such as ‘and’, ‘of’, etc., 
which have no clear meaning when considered on their own were also rejected. In several cases no 
suitable word was found on the designated page; in these cases words were then considered in exactly 
the same way from the following page. This procedure resulted in the selection of 98 of the 100 words 
used in the experiment (there are 980 pages in the dictionary). The final two were selected by 
choosing two more pages at random from the dictionary and then following the procedure described 
above. The final list contained 100 words ranging from ‘add’ through ‘manse’ to ‘zone’. Some of the 
words selected were extremely common (e.g. ‘chair’, ‘meet’, ‘two’) and some extremely rare (e.g. 
‘crore’, ‘nard’ and ‘surd’). 
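
The selection procedure just described can be sketched in code. The fragment below is an illustration only, not the procedure as actually carried out: the representation of the dictionary as a list of pages, the function name `select_words` and the predicate `is_suitable` (standing in for the experimenters' judgement about emotive or meaningless words) are all assumptions introduced here. It also assumes that the search continues to later pages until a suitable word is found, whereas the text mentions only the following page.

```python
import random


def select_words(pages, is_suitable, step=10, rng=None):
    """Pick one suitable word from every `step`-th page.

    pages       -- list of pages, each a list of one-syllable candidate words
                   (hypothetical representation; pages[0] is page 1)
    is_suitable -- hypothetical predicate standing in for the experimenters'
                   judgement (rejects emotive words, 'and', 'of', etc.)
    """
    rng = rng or random.Random()
    chosen = []
    for index in range(step - 1, len(pages), step):  # pages 10, 20, 30, ...
        # Assumption: keep moving to the following page until a word is found.
        while index < len(pages):
            candidates = [w for w in pages[index] if is_suitable(w)]
            if candidates:
                # Choosing among suitable words only is equivalent to drawing
                # at random and redrawing whenever a word is rejected.
                chosen.append(rng.choice(candidates))
                break
            index += 1
    return chosen
```

Applied to a dictionary of 980 pages with at least one suitable word on each designated page, a sketch of this kind yields 98 words, which matches the number reported above; the remaining two words would come from two further randomly chosen pages.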

Each word was typed in lower case on a separate 5 × 3 in. index card. The 100 cards were stacked 
in a deck and well shuffled before being presented to each subject. In addition to the shuffled deck of 
cards, each subject was presented with five additional cards. In the Familiarity condition, these cards 
contained the following phrases: ‘Very uncommon words in my experience’, ‘Quite uncommon 
words in my experience’, ‘Words which are neither common nor uncommon in my experience’, 
‘Quite common words in my experience’ and ‘Very common words in my experience’. The five 
additional cards used in the Favourability condition contained the following phrases: ‘Words I 
dislike’, ‘Words I rather dislike’, ‘Words I neither like nor dislike’, ‘Words I rather like’ and ‘Words 
I like’. Subjects were tested separately and in each case were simply given these materials and 
requested to sort the words into five piles as indicated (in addition, they were asked to try to put 
roughly equal numbers of cards in each pile if possible). After the subject had completed the sorting, 
the results were transferred to a standardized scoring sheet, and the cards were shuffled for the next 
subject. 


Results 


Mean familiarity and favourability ratings were computed for each of the 100 words, and 
plotted in scattergram form (Fig. 3). Each point can be regarded as fairly robust, since the 
mean ratings are derived from samples of 17 and 16 subjects respectively. 

Visual inspection of the scattergram provides some support for the inverted-U 
relationship. The hypothesized curve rises predictably for words of low familiarity, and 
appears to flatten out at values within the range of approximately 1.5 to 3.0. The high 
familiarity words show a greater degree of clustering, and there is a tendency for 
favourability ratings to drop at the top of the familiarity scale. This hypothesized 
relationship was tested in three ways. 

(a) Product-moment correlations were computed between familiarity and favourability 
ratings over all 100 words (r = 0.25, 0.05 > P > 0.01); for the 41 words with familiarity 
ratings less than 2.5 (r = 0.47, 0.01 > P > 0.001), and for the 59 words with ratings greater 
than 2.5 (r = -0.27, 0.05 > P > 0.01). The first result is predictable: the overall shape of 
the scattergram would lead us to expect a moderately significant positive correlation. The 
increased value of r for our 41 words of low familiarity provides support for the initially 
rising portion of the inverted-U curve, and the significant negative relationship for the 
words of higher subjective familiarity confirms that there is a fall in the curve within this 
range. Three regression lines have been drawn in Fig. 3 to illustrate these relationships. 

Figure 3. Scattergram of mean familiarity and favourability ratings for 100 words, with regression 
lines (A) for the whole sample, (B) for those words with familiarity < 2.5 and (C) for those words with 
familiarity > 2.5. 
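
The three correlations of analysis (a) can be illustrated with a short sketch. The function names `pearson_r` and `split_correlations` are introduced here for illustration, and the figures in the usage note are invented, not the experimental ratings; only the cut-off of 2.5 is taken from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def split_correlations(familiarity, favourability, cut=2.5):
    """Correlate over all words, then separately below and above `cut`.

    Words rated exactly at `cut` fall in neither subset (an assumption;
    the text does not say how such ties, if any, were treated).
    """
    pairs = list(zip(familiarity, favourability))
    low = [(x, y) for x, y in pairs if x < cut]
    high = [(x, y) for x, y in pairs if x > cut]
    return {
        "all": pearson_r(familiarity, favourability),
        "low": pearson_r(*zip(*low)),
        "high": pearson_r(*zip(*high)),
    }
```

With invented ratings that follow an inverted U, for example `fam = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0]` and `fav = [4.0 - (x - 2.5) ** 2 for x in fam]`, the "low" correlation is positive and the "high" correlation negative, which is the pattern reported above.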


(b) The analysis of variance technique for testing for linearity of regression of one 
variable on another (McNemar, 1962) was applied to the data, coded into familiarity 
step-intervals of 0.5. Using the variance estimates computed in Table 1, we find that the 
correlation ratio is highly significant (eta = 0.49; F = 4.16, d.f. = 7, 92, P < 0.001), and 
that the departure of the array means from linearity is also statistically significant 
(F = 3.64, d.f. = 6, 92, 0.01 > P > 0.001). This means that we can confidently assert that 
the relationship between familiarity and favourability departs significantly from linearity. 


Table 1. Analysis of variance table for regression of favourability on familiarity scores 

Source                          Sum of squares   d.f.   Variance estimate   F
Linear regression                     4.11          1         4.11
Deviation of means from line         12.21          6         2.04          3.64**
Between-array means                  16.33          7         2.33          4.16***
Within arrays                        51.11         92         0.56
Residual from line                   63.33         98         0.65
Total                                67.44         99

** 0.01 > P > 0.001; *** P < 0.001. 
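
The correlation ratio and the two F ratios can be checked against the sums of squares and degrees of freedom printed in Table 1. The sketch below is a check on the arithmetic only; the function name `linearity_tests` is introduced here. Because the printed entries are rounded to two decimal places, the recomputed F ratios agree with the reported 4.16 and 3.64 only to within rounding.

```python
from math import sqrt


def linearity_tests(ss_between, ss_deviation, ss_within, ss_total,
                    df_between, df_deviation, df_within):
    """Return (eta, F for between-array means, F for deviation from linearity)."""
    ms_within = ss_within / df_within          # error variance estimate
    eta = sqrt(ss_between / ss_total)          # correlation ratio
    f_between = (ss_between / df_between) / ms_within
    f_deviation = (ss_deviation / df_deviation) / ms_within
    return eta, f_between, f_deviation


# Sums of squares and degrees of freedom as printed in Table 1.
eta, f_between, f_deviation = linearity_tests(
    ss_between=16.33, ss_deviation=12.21, ss_within=51.11, ss_total=67.44,
    df_between=7, df_deviation=6, df_within=92)
print(round(eta, 2), round(f_between, 2), round(f_deviation, 2))
```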




(c) To gain further information about the nature of this departure from linearity, a 
curvilinear regression analysis was performed on the data (Kerlinger & Pedhazur, 1973). 
This analysis enables us to test the significance of increments in the proportion of the total 
variance successively accounted for by linear, quadratic, cubic, quartic and higher power 
relationships. Since the hypothesized inverted-U function would lead us to expect a 
significant quadratic component, the analysis was performed using the second-degree 
polynomial equation 


\[ FAV = a + b\,(FAM) + c\,(FAM)^2. \]


The significance of the incremental variance accounted for by the quadratic component 
was tested by computing 


\[
F(k_1 - k_2,\; n - k_1 - 1) = \frac{(R^2_{FAV.FAM,FAM^2} - R^2_{FAV.FAM})/(k_1 - k_2)}{(1 - R^2_{FAV.FAM,FAM^2})/(n - k_1 - 1)},
\]


where n = number of words, and $k_1$ and $k_2$ = degrees of freedom for $R^2_{FAV.FAM,FAM^2}$ and 
$R^2_{FAV.FAM}$ respectively. With $R^2_{FAV.FAM,FAM^2}$ = 0.22 and $R^2_{FAV.FAM}$ = 0.06, we find that 
F = 19.22, d.f. = 1, 97, P < 0.001: the quadratic component of the relationship between 
familiarity and favourability is highly significant, which suggests support for the inverted-U. 
The proportion of the total variance unexplained by the linear and quadratic 
components = 1 - 0.22 = 0.78: we must now use this as the error term in testing the 
significance of the linear component alone. We find that 


\[
F(k_2,\; n - k_1 - 1) = \frac{R^2_{FAV.FAM}/k_2}{(1 - R^2_{FAV.FAM,FAM^2})/(n - k_1 - 1)} = 7.43 \quad (0.01 > P > 0.001).
\]

The significance of the linear relationship between the two variables is confirmed, and is 
slightly lower than that of the quadratic component. 
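
The two F ratios of analysis (c) can be recomputed from the squared multiple correlations quoted in the text. The sketch below is illustrative; the function names `incremental_f` and `component_f` are introduced here. Since the two values of R squared are given only to two decimal places (0.22 and 0.06), the results, approximately 19.9 and 7.46, are close to but not identical with the reported 19.22 and 7.43; the discrepancy is within what rounding of the inputs could produce.

```python
def incremental_f(r2_full, r2_reduced, k_full, k_reduced, n):
    """F for the increment in explained variance from reduced to full model."""
    df_error = n - k_full - 1
    numerator = (r2_full - r2_reduced) / (k_full - k_reduced)
    return numerator / ((1.0 - r2_full) / df_error)


def component_f(r2_component, k_component, r2_full, k_full, n):
    """F for one component, using the full model's residual as error term."""
    df_error = n - k_full - 1
    return (r2_component / k_component) / ((1.0 - r2_full) / df_error)


# Rounded values quoted in the text: n = 100 words, R^2 = 0.22 and 0.06.
f_quadratic = incremental_f(0.22, 0.06, k_full=2, k_reduced=1, n=100)
f_linear = component_f(0.06, 1, r2_full=0.22, k_full=2, n=100)
print(round(f_quadratic, 2), round(f_linear, 2))
```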


Discussion 


When the stimulus words were roughly split into two groups, the relatively unfamiliar and 
the relatively familiar, liking was found to be positively related to familiarity in the former 
case (as in Zajonc-type studies) and negatively related to familiarity in the latter case (as in 
Cantor-type studies). The function that properly fitted the familiarity-favourability 
relationship over the full range of the familiarity variable was found to be curvilinear, first 
rising and then falling. Thus the result contained both the Zajonc-type and the Cantor-type 
effects, showing them to be complementary rather than contradictory. We undoubtedly 
achieved this by using a very wide spread of the independent variable; and this was made 
possible by the particular experimental procedure adopted. 

The complex dependence of liking for the words used in this experiment on their rated 
familiarity is striking. In particular, several of the very unfamiliar words were quite strongly 
disliked, and many of the words of intermediate familiarity were strongly liked. The 
possibility cannot be ruled out, of course, that correlations between degree of familiarity 
and other variables, e.g. association value and meaningfulness, may mediate the 
relationship we found (Cofer, 1972), and therefore our results could be partly artifactual. 
Our experimental procedure and method of word selection were such as to render the 
probability of this confounding bias relatively low. 

It is not being suggested, of course, that familiarity is the sole factor which determines 
liking for stimuli. What has been shown in this as in previous studies, however, is that 
familiarity is one important factor. It appears, furthermore, that when a sufficiently wide 
range of the novelty/familiarity continuum is sampled, the characteristic function relating 
familiarity and liking is of the inverted-U type. Theoretical considerations suggest that the 
parameters of this function may depend upon such factors as the subjective complexity, 
discriminability and orderliness of the stimulus objects. A plausible hypothesis may be that 
an inverted-U-shaped curve obtains in all cases, but that simple, highly discriminable and 
ordered stimulus patterns attain peak attractiveness at low levels of familiarity, while 
complex, poorly discriminable and unpredictable patterns produce curves whose peaks 
occur at relatively high levels. 

More detailed research is, therefore, needed to test conjectures of this kind. It is not 
impossible that work along these lines could help to account for the apparently haphazard 
way in which fashions wax and wane within a given culture like our own, rather slowly in 
some cases (e.g. classical music), somewhat more quickly in others (e.g. women’s clothes) 
and very rapidly in still others (e.g. pop music). 

Book reviews 


Cyril Burt, Psychologist. By L.S. Hearnshaw. London: Hodder & Stoughton. Pp. xi+370. 
£8.95. 


In the preface to Cyril Burt, Psychologist Professor Hearnshaw tells us that he started writing the 
biography in 1976, some 5 years after Burt's death, feeling admiration for the part he had played in 
the growth of British psychology. Then came the accusation in a Sunday Times article that Burt 
faked his results on intelligence. Sifting through letters and diaries which made no mention of 
research contacts and comparing those with articles in the British Journal of Statistical Psychology 
which implied an accumulation of new research material, Professor Hearnshaw was forced to the 
conclusion that there was justice in the accusations. Thus, what started as a conventional biography 
has turned into something quite different — the unfolding of a psychological case study, with plenty of 
detective work to back up the points made, and a plethora of historical detail to set the scene for the 
tragedy. 

Professor Hearnshaw has done a splendid job in presenting us with a believable portrait of Burt as 
a man who contained potential ‘seeds of destruction’ which grew and blossomed as a result of some 
experiences of malignant fortune. We are shown a young man who was hard-working, 
serious-minded, with high self-esteem and a sense of destiny. He moved from a classical education to 
applied psychology, and thence in middle age to academic psychology. We are told that he got on 
well with his clients and excellently with the powers-that-be, who trusted him and valued his common 
sense. Thus he came to be influential. But the rise to the top of the academic ladder removed him 
from the source of replenishment of his power since he was no longer able to collect raw data. That 
did not matter while he still had reserves of data to work on, but the hoard was almost certainly 
destroyed in the bombing of the Second World War. At about that time the onset of ill health and 
deafness blocked any hope of him personally collecting more data. 

He retired 5 years after the end of the war and even before his retirement he had become a most 
difficult and dictatorial colleague who would brook no rival, real or imagined. After he retired he 
lived for another 21 years. During that long and very busy time a steady descent into falsehood about 
research data seems to have occurred. This ranged from reconstructing details of data from 
summaries which still existed, through writing under pseudonyms which implied the continuing 
existence of a research team, to actual fabrication of figures to fit his own fixed ideas. He also 
overemphasized the pioneering element of his research analysis by distorting historical facts. 

Professor Hearnshaw provides insight into Burt's motivation, and it is clear that he did have to 
face serious difficulties of many kinds. However, we cannot excuse a man’s conduct as a scientist on 
the grounds of his personal circumstances, and our disillusionment with Burt as a scientist must be 
very great. Indeed, I think that we must conclude that the objectives of Burt’s life were not primarily 
to advance science and increase knowledge, but rather, to get his own views accepted, and to do so 
using any weapons which came to hand. He put up a tremendous series of fights against anything or 
anyone threatening to lessen his power in any respect. 

It seems to me that the really malignant fortune, for Burt, consisted in outliving the up-to-date 
impact of his pronouncements. For example, The Young Delinquent is still an important book, but by 
the 1950s it had to be read bearing in mind the social context of the 1930s, and not as a modern 
study of delinquency. Burt first published on twins and intelligence in 1943, and again in 1955, and 
these articles were almost certainly based on genuine data. They were enough to give him a respected 
place in the history of the work on the inheritance of intelligence. Why gild the lily, and with false 
gold? The reason would seem to be that he held his own views most fervently and he wanted to 
smash the opposition with a much stronger case than his real data could provide so long after they 
were collected. 

He has, in the long run, harmed his case and shattered his reputation. Unfortunately he has also 
harmed the reputation of the science of psychology, and all of us will have to pay for his scientific 
sins by a lessening of our professional trust (though one must grimly point to a need for that, 
nowadays, regardless of Burt). Professor Hearnshaw has written with care. He has not glossed over 
uncertainties. He has shown proper scientific detachment in his assessments yet has written with 
compassion. The result is a gripping book. The facts are set out: posterity is left to judge whether 
Burt did more harm after his retirement than good in his pioneering days. As far as our science is 
concerned, my opinion is that Shakespeare had the words for it: 

‘The evil that men do lives after them, 

The good is oft interred with their bones’. 
SHEILA CHOWN 


By the time this commentary appears, those interested in ‘the Burt affair’ will have learned that the 
allegations of fraud made with our public support in 1976 by Oliver Gillie have been substantiated. 
But there is far more material in this well-researched, scholarly and readable biography than a careful 
evaluation of Burt’s deceptions. It will make absorbing reading for all those interested in the 
development of psychology in Britain. 

This book had an extraordinary history: it was originally commissioned shortly after Burt’s death 
in 1971 by his devoted sister, Dr Marion Burt, and was undertaken by a distinguished historian of 
British psychology who, although neither a pupil nor colleague, had earlier written favourably about 
Burt’s work and influence. Fortunately he delayed his task until after retirement in 1975, by which 
time Kamin’s iconoclastic evaluation was widely known. However, the subsequent allegations of 
fraud necessitated a research programme never previously envisaged, which perforce included an 
attempt to account for Burt’s psychopathology. Leslie Hearnshaw found himself as judge and jury in 
a case where the accused was internationally known for his work on factor analysis and heritability of 
IQ, had been knighted for his public service as an applied psychologist, and had as professor at 
University College, London, influenced many students who later themselves achieved eminence. 
Furthermore, he had apparently, with the help of numerous assistants, gathered unique data on the 
intelligence of relatives which were increasingly referred to by biometricians, and which provided a 
substantial base for genetic interpretations of group and racial differences, particularly in the USA. 
Many distinguished scientists were publicly associated with Burt’s work, and regarded the allegations 
of fraud as both lacking in substance and politically motivated. Naturally they hoped, indeed 
expected, that a careful examination of the relevant material, including diaries and letters, by a 
historian of psychology known to be biased in Burt's favour would result in a verdict of not guilty. 
Such a verdict would have removed from them the otherwise inescapable conclusion that they had 
failed to perceive the internal evidence for fraud contained within Burt's seminal writings, and the 
consequent charge of either superficiality or bias. 

Burt has been described as a polymath of Renaissance dimensions; the account of his childhood 
and youth contains evidence of many of the characteristics of the gifted, including wide-ranging 
academic and cultural interests, which he cultivated to the end of his days. His unfortunate 
biographer, in order to do him justice, had to chase this Galtonian butterfly through 88 years of 
frenetic activity, during the last 62 of which he was seriously involved in test standardization; 
delinquency; maladjustment; subnormality; giftedness; vocational guidance and selection; aesthetics; 
the development of factor analysis; typography; ESP; biometrics and genetics. Art, music and the 
pursuit of Hebrew he retained as hobbies. 

It was not only the question of fraud that had to be investigated: for there were also charges 
against Burt of many other malpractices. Hearnshaw knew he had to be, and to be seen to be, 
impartial. Furthermore, as those in touch with him discovered, he was not to be hurried in reaching 
conclusions or in making them public. Justice was to be done, and the evidence relating to the 
extensive deceptions was to appear for the first time in the context of what the biographer and others 
perceive as Burt's many positive contributions to psychology, together with what looks like a plea for 
a verdict of diminished responsibility. On the issue of deception Hearnshaw concludes that beyond 
reasonable doubt Burt (1) falsified the early history of factor analysis, denigrating Spearman after his 
death and alleging that he was the real pioneer; (2) produced spurious data of MZ twins; and (3) 
fabricated figures on declining levels of scholastic achievement. ‘Other material on kinship 
correlations is distinctly suspect’. 

The twin studies are perhaps the most florid of Burt’s deceptions. In particular the MZ twins 
reared apart formed an ever-enlarging sample reaching 53 pairs in 1966, out of a world total of 122. 
Not only was this sample unique in size, but also in degree of intra-pair environmental difference. 
Burt’s alleged collaborators, Conway and Howard, were completely unknown to those closest to him, 
including his personal secretary. Cohen thought he remembered Howard in the late 1930s; MacRae 
met her twice, probably in 1949; Hammond (one of Burt's students who happened to be a twin) 
suggested in 1976 that she had assessed him at Aberystwyth during the war, and that ‘another lady 
whose name I knew began with “C”’ was working on the same project. If so, this was not part of 
any twin study, as incorrectly implied by the biographer (p. 243). But by July 1979, Hammond 
(personal communication) had a different recollection; it was now Miss Conway who had tested him. 
In any event, these fragmentary memories are of little relevance for neither Conway nor Howard were 
later to be found in the right place and at the right time (Clarke & Clarke, 1977; Gillie, 1979) doing 
the things Burt alleged. 

Unlike Newman, Freeman & Holzinger (1937) and Shields (1962), Burt never published case 
histories of his separated twins, and when approached by scholars such as the latter, said that Miss 
Conway would have them (Shields, 1978). Moreover, when in 1968 Jencks asked for the basic data 
on the IQs and social class of the 53 MZA twin pairs, Burt took 7 weeks to reply, lied on three 
counts in his apology for the delay and had spent the whole of one week ‘calculating data on twins 
for Jencks’ (p. 247). As Hearnshaw notes, ‘Had the IQ scores and social class gradings been 
available, they could have been copied out in half an hour at the most’. It seems therefore that our 
assumption in 1976 that Burt's method of fraud had been to work backwards from his correlations 
and invent data to fit them, was probably correct. 

Although Burt had earlier in life expressed an interest in twins and had later used his influence to 
obtain facilities for Cattell, Herrman and Hogben to study some, it seems possible — even probable — 
that Burt himself never had any systematic data of his own at all. We note that, in a letter to 
University College during the war he catalogues an array of different material he had left stored in 
the basement (p. 249). The most important and unique, on twins, is never mentioned, nor does it 
seem to have been transported to Aberystwyth. Why not? Almost certainly because it did not exist. 
Furthermore, Sutherland & Sharp (1979) have suggested difficulties in accepting that Burt had even 
gathered data on large samples of twins reared together. 

‘Burt was an extremely introverted, extremely private person, who rarely expressed his feelings to 
others... He hardly ever displayed anger and maintained a devastating politeness even when engaged 
in controversy.’ He married very late in life, and the relationship was unsuccessful. In certain contexts 
he displayed kindness, generosity and humour, particularly in the company of children, women or 
small groups of students; in others he was arrogant, paranoid and unscrupulous. Throughout his life 
there was an inwardness about Burt which detached him from close social contacts; there was also a 
driving ambition to be acknowledged as first, coupled with vast erudition and an extraordinary 
verbal facility which included a subtle use of ambiguous statements. According to the biographer, 
Burt ‘was not, perhaps, either by training or temperament a scientist. He was too impatient to reach, 
and too confident of having reached, firm conclusions which became for him very early in his career, 
articles of almost religious faith to be defended at all costs’. Commenting on Burt's first article (1909) 
Hearnshaw notes ‘Inadequate reporting and incautious conclusions mark this first incursion of Burt 
into the genetic field. We have here right at the beginning of his career, the seeds of later troubles.’ 
Perhaps he deceived himself. This we shall never know, but from the evidence presented in this book 
and from Burt's own writing we believe it likely that numerous malpractices including his vast 
propensity for deceiving others started early rather than late in life, although certainly they became 
exaggerated in later years. It is a pity that this man, who cared so little for the principles of empirical 
science, should have influenced the development of applied psychology in Britain. It was not that he 
lacked good contemporary models; as one example, a symposium on vocational guidance published 
in this Journal (1924) contains a closely reasoned argument by Thurstone for the essential 
methodological criteria on which vocational guidance might be based; it can be read with profit 
today. In contrast, Burt's paper seems impressionistic and mediocre. His use of short cuts, to which 
Hearnshaw refers, suggests a pervasive superficiality. For example, in another 1924 report Burt claims 
personally to have assessed 562 backward children in four weeks, at the rate of a little over 10 
minutes per child, using mainly the Binet-Simon test. 

Among several problems that remain there is the important matter of deleting all references to 
Burt’s ‘empirical’ work on the heritability of IQ, kinship correlations for intelligence and for social 
mobility from all texts and tables in books and articles, some perhaps in press. This must also extend 
to some of the arguments advanced by those such as A. R. Jensen and H. J. Eysenck who have relied 
heavily on Burt's findings. A second, less urgent enterprise is for historians to extend the evaluation 
of Burt so ably initiated by Professor Hearnshaw. There is also a third, more painful issue. Some of 
Burt's many malpractices were known during his life time. Not only were his seminal articles lacking 
in essential details, but they were riddled with ambiguities and internal discrepancies. He received many 
inquiries from scholars seeking elucidation, and apparently always succeeded in deceiving them. That 
his frauds escaped public exposure for so long is a disturbing event in the history of science. 

ANN M. CLARKE 
A. D. B. CLARKE 

Did he or didn't he? The salacious or merely curious reader of Professor Hearnshaw's biography of 
Cyril Burt may turn immediately to Ch. 12 for an answer. In this chapter, cautiously entitled 
‘Posthumous controversies’, Hearnshaw concludes that Burt did. He did fabricate his data; he 
evaded requests to see those data, telling at different times totally inconsistent stories, most of which 
were patently false; he filled the journal he edited with articles written by himself under false names, 
which supported his views and attacked those of his critics; and he falsified the early history of factor 
analysis to Spearman’s disadvantage and his own credit. 

The allegations of fraud made against Burt after his death, and after Hearnshaw had agreed to his 
sister’s request that he should write this biography, have no doubt changed its intended character, 
but have also surely added much interest to its contents. In retrospect, it is clear that Hearnshaw has 
proved the ideal author. Only a man as determinedly fair as he, whose assessment of Burt and his 
work when first asked to write the book ‘was almost wholly favourable’, and who can still at the end 
of it write, ‘It would be totally unfair for a final judgement on Burt to focus on his deceptions to the 
exclusion of all his positive achievements’; only such a man will carry conviction when he concludes 
that ‘beyond reasonable doubt’ the charges of fraud are entirely justified. 

This is the work of an historian of British psychology, and those who find this a fascinating topic 
will find much to be fascinated by. It is a sedate and measured book, with many pleasing touches, 
much interesting information, and a leisurely, pleasingly old-fashioned air. It takes us through Burt’s 
childhood, education and early career, and paints a marvellous picture of Burt in retirement, writing 
an average of 10 or so papers a year during the 21 years to his death (to say nothing of all those 
other papers by fictitious contributors to the British Journal of Statistical Psychology), as well as 
examining, reviewing hundreds of manuscripts for publishers, editing a journal, maintaining a 
voluminous correspondence, and writing a diary. 

The diary, however, proved his undoing, for it provides Professor Hearnshaw with some of his most 
telling evidence of fraud. From 1951, the year of his retirement from University College, Burt 
published alone, in collaboration with Margaret Howard, or more simply under the pseudonym of 
Jane Conway, some half-dozen articles on the IQ of MZ twins. These were represented as following up 
his 1943 paper which reported, inter alia, the data from 15 pairs of separated MZ twins. In 1955, this 
sample had increased to 21 pairs, in 1958 to 42, and by 1966 to 53 pairs. During this time, 
Hearnshaw shows, Burt collected no data himself, had no research assistants working for him 
collecting such data, and had no contact with either Miss Howard or Miss Conway, two ladies whom 
nobody appears to have set eyes on since 1943. In spite of remarks such as those written in 1960, that 
‘for a more conclusive answer...we must I imagine await the results of Miss Conway and others who 
are applying tests of various abilities to twins who have been brought up separately from birth’, no 
such data were being collected. Hearnshaw concludes that if Burt ever did have data on the IQ of 
MZ twins, they were collected before the war and destroyed by enemy action during it. 

Although by now the most celebrated example, the twin data are not the only instance of 
malpractice or fraud that Hearnshaw documents. Burt's study of social class and the resemblance in 
IQ between parents and their offspring is regarded as ‘a dubious exercise’, and his claim, based on 
assessments supposedly conducted at intervals between 1914 and 1965, that educational standards 
were falling, is said to use data that are ‘at least in part fabricated’. 

It is perhaps irresistibly tempting to ask why Burt did it. Certainly Professor Hearnshaw does not 
resist the temptation, and Ch. 13, ‘The Man’, attempts a little psychological analysis. Perhaps this is 
only fair, and we should not complain if one psychologist, writing the biography of another, bandies 
about phrases like ‘mechanisms of defence’ or ‘marginally paranoid personality’, and delves back 
into his subject’s childhood for clues to his behaviour at the age of 70. I am more persuaded by an 
earlier, more casual suggestion, that the papers on MZ twins were written in response to what Burt 
regarded as ill-informed, politically motivated, sociological criticism. What better way to prove his 
point than to show that psychologists were busy collecting empirical data to settle the question while 
sociologists were simply airing their prejudices? 

In the end, however, this is not the most important question, and if there is a fault in Hearnshaw's 
book it is that he takes his duty as a biographer too seriously and as a scientist not seriously enough. 
There is very much more scientific discussion of Burt's work in Hearnshaw's book than I have 
suggested, but it is not always as incisive or as well informed as one might like. And Hearnshaw 
completely ignores a question of some general scientific concern: why were Burt's data accepted for 
so long? Ignoring the question of fraud, the fact of the matter is that the crucial evidence that his 
data on IQ are scientifically unacceptable does not depend on any examination of Burt's diaries or 
correspondence. It is to be found in the data themselves. The evidence was there, as Dorfman has 
now shown, in 1961. It was, indeed, clear to anyone with eyes to see in 1958. But it was not seen 
until 1972, when Kamin first pointed to Burt's totally inadequate reporting of his data and to the 
impossible consistencies in his correlation coefficients. Until then, the data were cited with respect 
bordering on reverence, as the most telling proof of the heritability of IQ. It is a sorry comment on 
the wider scientific community that ‘numbers. . .simply not worthy of our current scientific 
attention’, in Kamin's fairly restrained phrase, should have entered nearly every psychological 
textbook, and been used without question in every discussion of this issue. 

But Professor Hearnshaw must have the last word, for his final judgement is a masterpiece of 
irony: ‘Burt may yet be accorded a place in history as one of psychology's imaginative pioneers’. 

N. J. MACKINTOSH 

Although histories of modern psychology are often (too often) short personal and intellectual 
biographies of the leading figures, and although we have the informative series of short 
autobiographies initiated by Murchison, we have altogether too few full length biographies of leading 
psychologists. We have learned a great deal about the psychology of disturbed persons from the case 
histories that have been compiled. Properly done biographies of notable psychologists could well 
throw light on the psychology of psychologizing. 

Hearnshaw’s biography of Cyril Burt is a sound, balanced, thorough, scholarly piece of work. It is 
based on the examination of various sorts of evidence (mainly written but some oral) such as Burt’s 
own voluminous writing on an extensive range of psychological topics, Burt’s diaries and notes when 
they are available, and reports by others who were acquainted with Burt. One presumes that 
Hearnshaw began his work for this biography with no more thought than bringing his historical and 
psychological skills to the task of showing how one of Britain's most notable 20th century 
psychologists had developed as he did. He must have known, though not a pupil of Burt's, of some of 
the latter's quirkishness. As an Antipodean who met Burt only once (in 1952 and quite informally) 
but who had been reading from 1931 onwards much of his work, I formed the following opinions: he 
was a very learned man who irritatingly paraded his learning, a man with a tremendous range and 
depth of intellectual penetration but who was patently so vain, a man of great originality who was 
painfully assertive about his own priority in various matters. I felt that he was not fairly representing 
his rôle in the development of factor analytic methods (in his writings from 1940 onwards), but I 
attributed this to his vanity. It never occurred to me that he might be a cheat. Nor do I imagine this 
occurred to Hearnshaw much before 1973, 2 years after Burt's death. Around 1972-1975, Kamin, 
Jensen and the Clarkes, however, were beginning to call into question Burt's correlation kinship data, 
which had earlier been so widely cited by the hereditarians in their arguments on the inheritance of 
intelligence. 

Hearnshaw traces Burt's family background, his education, the formation of his psychological 
views, the areas of and the nature of his psychological contributions, his skill as a university teacher 
both undergraduate and postgraduate, and his fluency and productiveness as a writer of scholarly 
books, papers and letters (indeed it would not be unfair to say that Burt was a compulsive writer). A 
very clear picture of Burt the man comes through. It is a picture of contrasting whites and blacks; he 
was generous to intellectual inferiors but absurdly jealous of possible rivals, a clear thinker but ready 
to distort a point in order to win an argument, a seeker of admiration but shy and almost paranoid 
about what others thought of him. 

Though Burt had gained great competence in mathematical techniques, his education was in 
classics, philosophy and somewhat primitive Oxford psychology, and his early experience in 
psychology was practical and clinical. An applied psychologist using psychometric and interview 
methods has to be prepared to use quite often rather rough-and-ready data. Burt made many 
important contributions to the techniques of producing better psychometric data. Unfortunately when 
through his own unaided efforts he acquired his later skills in mathematical techniques, he used these 
razors to chop the rough blocks he had earlier acquired. Hearnshaw suggests that Burt’s deviation 
from the best standards of scholarly work began about 1940, and in a clinical way he suggests 
reasons and causes for this deviation. I agree with the dating of the deviation, but I am left 
unconvinced by Hearnshaw's account of the psychodynamics; though he penetrates quite deeply into 
Burt's psyche I doubt that he penetrates deeply enough. 

Hearnshaw appears to me to be very fair in drawing up a balance sheet of Burt's strengths and 
weaknesses. On the one side are his contributions to test-construction and test-use, to the 
understanding of juvenile delinquency, to the need for differential educational treatment of the dull 
and the bright, and to the early recognition of ‘group’ factors additional to Spearman’s ‘general’ and 
‘specific’ factors. On the other side are his vanity, his determination to win every argument, and his 
compulsive writing. These combined in leading him to make what are patently false claims about his 
own priority in respect of factor analysis, to produce either faked or carelessly reproduced kinship 
data, and to present, as though they contained observed data, ‘illustrative’ quantities derived from a 
theory he was wishing to defend. 

This splendid biography of a now controversial figure retains a balance and fairmindedness that 
biographers of controversial figures find difficult to sustain. It is singularly free of errors of detail; I 
found two spelling errors (one repeated), one error of historical fact and one error in a page reference 
in the index. These are too trivial to be specified and are perhaps a product of my close scrutiny of 
Burt's reported kinship study (done while I was awaiting the delayed receipt of Hearnshaw's book). 

Among the appendices are a valuable bibliography of Burt's publications and of the candidates by 
calendar year who gained Master's degrees or doctorates under Burt's supervision when he was a 
part-time Professor at the Day Training College or during his regime at University College London. 
W. M. O'NEIL 


The Ecological Approach to Visual Perception. By J. J. Gibson. London: Houghton Mifflin. 1979. Pp. 
332. £13.50. 


Particularly since the revolutionary discoveries of Hubel and Wiesel a couple of decades ago there has 
been a marked tendency for students of vision (myself included) to concentrate upon mechanisms 
rather than on processes, upon the analysis of features rather than upon the synthesis of percepts. In 
contrast, for over three decades Gibson has conducted an almost single-handed crusade against this 
strategy, and has been zealous in the promotion of his own creed, which is based upon the notion of 
‘ecological optics’. 

Gibson’s view is that we should start not with the retinal image but with the optic array, the 
structured pattern of light reflected from the environment which, because of the fundamental laws of 
optics, bears a fixed and specifiable relationship to the nature of that environment. It is not merely 
the static properties of this array, but the nature of the transformations it undergoes when either the 
objects or the observer move within the environment, that constitutes the ‘ecological’ optical 
information. 

This information, because of the ‘invariances’ of the array and its transformations, is, according to 
Gibson, directly accessible to the observer. It is ‘picked up’ and requires no further processing: the 
percept is given by the structure of the information already present ‘in the light’. Examples of such 
ecological information are texture ‘gradients’, the flow patterns of the visual field generated by 
movement of the observer through the world, and the way in which previously hidden parts of an 
object are revealed as one moves around it. 

I described Gibson’s work as a ‘crusade’, and it is a metaphor that extends itself very easily. 
Evangelists are rarely noted for their reticence, their balance, or their economy of style, and this is 
unquestionably an evangelizing book. ‘It is not necessary to assume that anything whatever is 
transmitted along the optic nerve in the activity of perception.’ ‘...most of what has been written 
about pictures and images over the centuries is misleading, or hopelessly vague. We should forget it 
all and start afresh’ — on Gibsonian lines, of course. Gibson appears to believe that his approach is so 
radical, and other ways of looking at things so fundamentally wrong, that everything else should go. 
This is nonsense of course. The fact of the matter is that Gibson has indeed made a major 
contribution by asserting the need for new, much richer, descriptions of the visual information 
available from the environment, and has made several extremely valuable contributions in this respect 
himself. But his approach requires merely a change of emphasis, not one of direction, and this can 
easily be incorporated into current orthodoxy. 

I have deliberately avoided referring to Gibson’s ‘theory’, since his way of looking at things does 
not constitute a theory; certainly not in the way that, for example, Marr’s much more rigorous 
attack on a similar set of problems constitutes a theory. It seems to me that for Gibson the sole 
problem of perception lies in a careful description of environmental optics; this is surely a necessary, 
but by no means a sufficient, basis for an explanation of the process by which man becomes aware of 
his environment. 

In this book, this rather undisciplined and self-indulgent book, Gibson takes an unconscionably 
long time to make each point. (‘The earth-air interface is...the most important of all surfaces for 
terrestrial animals. This is the ground.’) However, there really is not an awful lot in this book that 
was not in his original (1950) The Perception of the Visual World, which was a great deal more 
readable. The present book — with its massive use of the personal pronoun, its frequent and often 
gratuitous neologizing, and its staggering over-use of italics for emphasis (most pages contain three or 
four italicized words; page 108 is the only one I noticed that was totally devoid of italics) — is a very 
irritating one to read, and I am not at all convinced that it repays the effort required. 

Despite the fact that Gibson's book has received the imprimatur from no less a figure than Neisser 
himself, much of the non-analytic content appears to me to be idiosyncratic, mystical, and largely 
untestable. The most mystical concept of all is the notion of ‘picking up’ environmental information. 
What can this mean? I can put no reasonable construction on the phrase without referring to the 
observer’s sensitivity to transformations in his retinal image, yet Gibson talks almost as though the 
mere existence of ecological information is enough to generate a percept, even in the absence of a 
retinal image or an optic nerve, by some mysterious process of optical osmosis. 

It is probably clear to the reader that I found this a profoundly frustrating book to read. It is also 
the most troublesome book to review: precisely because of its idiosyncratic nature it is very difficult 
to judge the extent to which one's own reaction to it is likely to be representative. There is, I suppose, 
the chance that some might find the book stimulating and provocative; I found it merely provoking. 
B. J. MOULDEN 


Knowledge and Development. Volume 2. Piaget and Education. Edited by J. M. Gallagher & J. A. 
Easley. New York: Plenum. 1978. 


This book comprises eight articles which examine the applicability of Piaget's theory to education. 
Topics covered are general methodology, moral development, early education, science curricula, 
mathematics, reading and exceptional children. 

A reading of the book gives a rather gloomy view of the present state of the art and raises more 
questions than it answers. The general tone of the book is circumscribed by the fact that Piaget's 
intention was to produce a theory of knowledge in terms of its genesis and not an exercise in 
pedagogy; hence educators must look to it for implications rather than applications. To date, it 
would seem that this has resulted in many misapplications, trivializations and even the development 
of ideas which are directly at variance with Piaget's position. 

Two very different aspects of Piaget's theory have attracted the attention of the educators: 

(a) Equilibration (or the motor for cognitive development). This is the process whereby the child 
‘acts’ upon objects or ideas, and the discrepancies between what is anticipated and what actually 
happens trigger the child into restructuring his ideas into a more sophisticated system. The term 
used for this general approach throughout the book is constructivism. Gallagher points out the 
problems raised by simplistic interpretation of this approach (‘I do and I understand’) which result in 
resemblances to the traditional approach to nursery schools and to self-discovery learning 
programmes; and that it can only be innovative when linked to the concept of ‘Reflexive 
abstraction’. De Vries and Easley develop this idea with respect to early education and mathematical 
understanding. 

(b) Stage/structure or the division of thinking into certain stages which must be negotiated in an 
invariant order, and which can be characterized by certain logical forms (e.g. the formal operations 
stage which can be described in terms of the 16 binary operations and Klein's Viergruppe). 
Theoretically all problems can be analysed in terms of the underlying logical structure and levels of 
sophistication which should predict whether child (a) can solve problem (b). (Lovell and Shayer 
illustrate this with scientific concepts.) However, as Pascual-Leone et al. point out, Piaget's concept of 
‘horizontal décalage’ allows for the fact that some problems are easier than others with a similar 
logical structure depending on context, presentation, etc.; thus there is little possibility for predicting 
positive transfer from one case to another. They claim, further, that the stage theory invites 
approaches in which the stages are reduced to logical structures, which are further reduced to 
particular Piagetian tasks, with the idea that training on these tasks will promote intellectual 
development, ‘and this derivation does not follow from the basic tenets of Piaget’s genetic 
epistemology’. 

How, one might ask, has the chaos resulted among thinking researchers and educators? There 
appear to be three main reasons for this and references to these constitute a consistent theme in the 
book. 

(a) Piaget's concerns, i.e. the topics treated and the mechanisms for change (or learning), are not 
congruent with those of the traditional educators. For Piaget logical thinking is often exemplified in 
‘scientific’ experiments, those to do with mechanics, balancing, floating objects, etc., while 
traditional science education is concerned with many other scientific ‘facts’; further, the humanities 
may be concerned with a logic which is applied to facts of a very different nature. Secondly, 
traditional education, in many cases, may pay lip-service to insightful learning, while continuing, 
covertly, to promote rote learning. 

(b) Piaget is constantly changing and developing his theory and consequently the gap between 
Piaget's critics and friends (as Sigel pointed out in the Introduction) remains constant. Easley's article 
makes this explicit with regard to conservation and the ‘envelope’ borrowed from category theory in 
modern mathematics (which was new to me). 

(c) Piaget's theory is not specific enough to be applied directly to education in its pure form. As 
stated above, the concept of ‘horizontal décalage’ explains everything and predicts nothing to a degree 
that it rivals the Freudian concepts of defence mechanisms. Two possibilities of overcoming this 
have been evolved. 

(i) Develop a mini-theory within the Piagetian framework. Pascual-Leone and his co-workers have 
tried to do this. 

(ii) Develop a taxonomy of tasks and variables affecting them in order to predict/enhance 
performance. However a reading of Easley's article underlines the fact that four decades of research 
into conservation have failed to do that. 

Either way the result may not be totally reconcilable with Piaget's theory since the employment of 
specific objectives and associated strategies is, to say the least, contrary to the self-discovery, 
constructivist aspect of Piaget's theory. Possibly we have to accept that it is a contradiction in terms 
to apply Piaget's theory; and that if we must find applications, we should look to it as a fruitful 
source of ideas while realizing we do violence to the spirit of his work. 

What then is positive about this book? 

It is a careful re-examination of Piaget's position with regard to the topics listed at the beginning of 
this review. In some cases there are concrete suggestions as to how Piaget's ideas might be used 
(Lovell & Shayer on science, Lickona on morality); except in the case of the latter two topics this is 
not a book for teachers looking to Piaget for practical ideas. Neither is it wholly a book for 
researchers in the field looking for literature reviews or sources of inspiration, though they might find 
articles by Pascual-Leone et al. and Easley useful. It seems to be more concerned with the problems 
and pitfalls in the applicability of ‘psychological’ theory to education in general and of Piaget's in 
particular. As such it should be of interest to theoretically minded educators or Piagetian-minded 
psychologists with a conscience about the uselessness/fulness of their research; but not to those who 
are totally unfamiliar with Piaget's theory. 

VERONICA LAXON 


A Functional Approach to Child Language: A Study of Determiners and Reference. By A. Karmiloff-Smith. 
Cambridge: Cambridge University Press. 1979. Pp. 258. £15.00. 


This book is one of the latest Cambridge Studies in Linguistics, a series which has included several 
significant books on child language. Karmiloff-Smith's contribution follows this tradition. It begins 
with an analysis of Piaget's views on the relation between language and cognitive development. 
Karmiloff-Smith argues that language has been unjustifiably relegated to a secondary position in 
Piagetian theory, particularly during the first eight years of life. In particular, Piaget has 
underestimated the importance of language as an experimental variable in cognitive tasks, and as a 
factor in child development generally. He has also ignored the fact that the child has to learn to deal 
with many problems which are specifically linguistic. The book re-examines the role of language in 
cognitive development by presenting a specific hypothesis about the evolving comprehension of noun 
determiners, particularly the definite and indefinite article. Articles are of particular interest because 
although they appear early in child speech, they are plurifunctional in adult speech and thus take the 
child some considerable time to understand completely. Karmiloff-Smith is concerned not so much 
with when they appear in child speech but with how their functions gradually change for the child. 
Although these studies were carried out with French children, with the exception of the experiments 
on gender, the results are certainly relevant to English where a parallel system of plurifunctionality 
exists. 

The studies are prefaced with a discussion of one of the most difficult problems facing those who 
study child language in an experimental setting. On the one hand, if we present sentences in their 
normal extralinguistic and discourse context, it is impossible to tell whether the child actually 
understands the grammatical form or is relying on cues from the context to arrive at a plausible 
interpretation. On the other hand, if we remove all such sources of information, we can never be sure 
that we have observed anything but a procedure which the child has adopted to deal with our 
experiment. Karmiloff-Smith's approach to this dilemma is to carry out a variety of comprehension 
and production experiments using semi-standardized items together with an exploratory approach in 
which she questions her subjects about the responses they make. A large part of the book is devoted 
to describing these studies and children's comments are quoted at length where they shed light on 
their interpretation of the experimental situation. On reading the studies, it becomes clear that 
questioning the children was a valuable exercise because identical responses can occur for completely 
different reasons. In particular, it soon emerges that putting the obvious adult interpretation on the 
child's performance can be totally misleading. 

Karmiloff-Smith's general conclusion is that while linguistic development is clearly affected by 
cognitive development as far as general mechanisms go, many of the problems which the child has to 
deal with are specifically linguistic. In particular, the fact that words are plurifunctional presents 
particular difficulty. She argues that there are three stages in learning to deal with plurifunctional 
words. Initially, each word is treated as having only one function and other functions are not 
expressed. Later, children become aware that a word may serve more than one function but they do 
not realize that the same word can serve several functions simultaneously. They therefore use a 
separate morpheme for each function they wish to convey. The final stage is reached at about 8 years 
when children endow words with plurifunctional status. 

Using this model, Karmiloff-Smith re-examines some of Piaget's data on the concept of identity. 
She argues that although Piaget is cautious about interpreting similar responses at two different 
developmental levels as having the same cognitive status, he ignores the fact that different verbal 
responses may have the same conceptual status. She concludes that language is an important 
experimental variable in cognitive tasks, a fact which Piagetian research has tended to underestimate. 

I found this a thought-provoking book which raises several critical issues about both the 
methodology and findings of research into linguistic and cognitive development. The arguments are 
cogently made and, on the whole, compelling. My only slight criticism is that possibly too much 
detail about linguistic theories of determiners is given, particularly since the experiments were based 
on the author's analysis of adult output rather than systemic grammar. However, those who are not 
concerned with such theories may miss this section without loss. I am sure many will wish to study 
this book, which should be essential reading for those researching into linguistic development. 
MARGARET HARRIS 


Emotions in Personality and Psychopathology. Edited by Carroll E. Izard. New York: Plenum. 1979. 
Pp. xx + 597. $18.58. 


The time is long overdue to begin asking some different questions about the nature of emotions; what 
an emotion is, or how many distinct emotions there might be have directed research for the last 100 
years. A more topical question is how emotions affect our behaviour and our thinking. In posing the 
more general question, the study of emotion loses its insularity and becomes integrated with other 
areas in psychology. Emotions in Personality and Psychopathology is a collection of papers attempting 
to answer this broader question. 

The contributions encompass a tremendous range of topics tackled from a variety of theoretical 
viewpoints. They are loosely organized into three parts: moods, traits and defence mechanisms; pain, 
anxiety, grief and depression; emotion awareness, expression and arousal. With 19 separate 
contributions, it is impossible to comment on each one and only a fragmentary impression of this rich 
book will be possible. The editor has attempted to establish a sense of unity and continuity by 
providing an introduction outlining general themes and prefacing each chapter with a brief summary 
of its key points. Despite his efforts, the book's heterogeneity shines through and emerges as one of 
its attractions. 

The book begins with two chapters on positive emotions: Singer's on play and fantasy stresses the 
happiness that imaginative play can bring and its significance for normal development; Levine 
discusses humour and its abnormal manifestations in certain psychopathologies. After this relatively 
cheerful start, the emphasis shifts to the more negative emotions for the rest of the book. For those 
new to the contributors' work, there will be some surprises. For example, Pilkonis & Zimbardo tell 
us their survey suggests that 40 per cent of the American college population describe themselves as 
‘shy persons’ and the Number One fear for US inhabitants, which looms larger even than fear of 
sickness and death, is the fear of speaking before a group. Another surprise is in Wessman's chapter 
on moods: he describes a longitudinal diary study which suggests that for most women, their mood is 
predominantly affected by external events and is not related to their menstrual cycle. 

Two chapters deserve special mention. The first is Leventhal & Everhart's on emotion, pain and 
physical illness. They present a parallel processing model of pain in which the degree of pain distress 
is regarded as a function of both the noxious sensory information and a distress-emotional 
component, both being processed simultaneously and contributing to the total pain experience. The 
degree of pain distress is affected by memories of previous pain distress to similar noxious stimuli 
which are stored in pain schemata and activated on the recurrence of the sensory information. Pain 
schemata are believed to provide an explanation for the mystery of phantom limb pain. This chapter 
has been singled out because it is the best example of a successful attempt to realize the main aim of 
the book, namely to explore the part emotion plays in a particular domain. A theory of pain distress 
has been proposed in which the concepts of sensory information, emotion and pain are kept distinct 
yet all contribute in specified ways to the degree of pain experienced. In several of the other chapters, 
there is a tendency to equate emotion with the particular area of study, rather than propose the 
relation between the two. This is particularly noticeable in the personality domain where the 
distinction between enduring personality traits and transitory emotional states has always been 
problematic. For example, in Mosher's chapter on guilt, he tries to avoid the problem by arguing 
that guilt is a disposition which determines the intensity of the affective experience of guilt for a given 
person in a particular situation. Zuckerman, in a highly complex chapter, proposes that 
sensation-seeking be separately measured in both its trait and state forms. 

The other chapter deserving particular mention is Maslach's critique of Schachter & Singer's classic 
study and her failure to replicate their findings. This contribution is one of the most novel in the 
collection. She presents evidence to support her theory that unexplained, undifferentiated arousal 
should not be regarded as emotionally neutral, but is in fact more likely to be experienced as a 
negative emotion, irrespective of the cognitive cues provided by the environment. Maslach’s work is 
particularly welcome in this collection since it has not yet appeared in the journals, although it is due 
to do so. Many of the contributors have published elsewhere (e.g. Beck, Exline, Gray, Lewis and 
Scherer, in addition to the authors already mentioned), and, while their chapters provide an excellent 
introduction to their ideas and in some cases report new data, they will be of lesser interest to readers 
already familiar with their work. 

Since the chapters contain little background information on emotion, personality or 
psychopathology, this book will be of interest mainly for readers at graduate level and beyond, 
although it could also be used as the key text for a final year specialism for undergraduates. It is an 
innovative book, amounting to a substantial step towards the realization of the editor's goal of 
achieving a science of the emotions. 

SARAH E. HAMPSON 


Personality Theories: An Introduction. By B. Engler. Boston: Houghton Mifflin. 1979. Pp. xvi+511. 
£10.50. 


This textbook is aimed specifically at the undergraduate requiring an introduction to personality 
theories. Its stated objectives are to describe and evaluate a wide range of theories and to indicate 
how they may be applied, both in a psychotherapeutic context and in the enhancement of the reader's 
self-understanding. While the introductory student will find it a useful source for brief outlines of 
otherwise voluminous personality theories, the teacher of personality courses hoping for a new 
approach will be disappointed. 

The opening chapter's title (‘What kind of games are these?’) is made clear when the language 
game of philosophy is distinguished from the language game of science and the reader is warned that 
personality theories play both. There is a short section on psychotherapy, which is described as an 
‘art’, and the scientific evaluation of psychotherapy is dismissed as being insensitive to the subtler 
nuances of psychoanalytic insight. Thus the book reveals its basic bias: since personality theories are 
both philosophical and scientific, it is inappropriate to evaluate them and their applications on purely
scientific grounds. As a consequence of this position, the evaluative sections in the remainder of the 
book are short and uncritical. For example, the large body of research on the scientific study of 
Freudian theory is merely noted in a single sentence. The book proceeds with two chapters on
Freud's theory of personality and psychoanalysis. This is followed by neo- and social psychoanalytic 
theories, learning and trait theories, humanistic, cognitive and, finally, Eastern theories. In all, 17 
theories are described in depth and several others are briefly mentioned. Each theory is prefaced by a
short biography of the theorist, who is also illustrated by a delicate pencil portrait. The descriptions of
the theories are presented with clarity in an easy and readable style, although it is evident that the
author is most at home describing psychoanalytic theories and least happy with trait theories. As is
so often the case with American texts, Eysenck's work is only briefly mentioned. Technical 
terminology is explained in a glossary and carefully selected suggestions for further reading are 
provided for each theory. 

For the numerous generations of students raised on Hall & Lindzey, this book must have a 
familiar ring, and publication of the third edition of their classic work (Hall & Lindzey, 1978) has 
eclipsed most of the features of Engler's book. Both cover a similar range of theories and present
these theories in a similar fashion. Even the final chapter on Zen is not novel, since Hall & Lindzey 
also have a chapter on Eastern psychology. However, the book does contain one unique feature: the 
inclusion of 40 exercises interspersed throughout the text, which assist the reader in applying the 
theories to his/her self-understanding and provide an alternative way of presenting some of the 
information. For example, we are invited to try out free-association, to attempt some items from the 
‘Chitling Test’ (an IQ test designed from the black culture's viewpoint — a sobering experience), to do 
a Rep Grid and a Q sort. While students under pressure might be tempted to skip them, they provide 
some useful ideas for teachers wanting to enliven lectures. The book as a whole, however, cannot be 
regarded as a source of enlivenment in the discussion of personality theories. 

SARAH E. HAMPSON 


HALL, C. S. & LINDZEY, G. (1978). Theories of
Personality, 3rd ed. New York: Wiley.


Clinical Diagnosis of Mental Disorders: A Handbook. Edited by B. B. Wolman. New York: Plenum 
Press. 1978. Pp. 921. Price not stated. 


This compendium is devoted to the procedures used by clinical psychologists when assessing people 
with ‘mental disorders’. It contains 25 chapters and half a million words. There are two parts: 
‘Diagnosis’ and ‘Differential diagnosis’. In the first and larger part, authors describe particular test 
procedures, such as the Rorschach, TAT, MMPI, Hutt Adaptation of the Bender-Gestalt, and 
WAIS. In the second, topics like brain impairment, mental deficiency, schizophrenia, old age and 
neurotic disorders form the basis for review. Authors are given a good deal of latitude as to what to 
include or leave out and the coverage varies from the thorough to the idiosyncratic. 

The chapters on specific tests are the most successful, although in most cases comprehensive 
reviews are already readily available. They tend to be limited to technical exposition rather than to 
critical consideration of indications and contra-indications for use. It would have been interesting, for 
example, to have seen a comparison of operational and interpretative uses of the Rorschach. The 
evidence by L. C. Wynne, M. Singer and their colleagues that reliable scores can be derived which 
differentiate highly between the parents of schizophrenic patients and parents of control subjects, and 
the counter-evidence of S. Hirsch and J. P. Leff who failed to replicate these findings, is surely 
relevant to diagnostic issues. 

It is probably inevitable that what is chiefly striking about a book of this kind is the gaps. There is
no systematic account of differential symptom definitions, techniques of structured interviewing, the 
place in classification of putative causes (for example, amphetamine intoxication or temporal lobe 
epilepsy in association with schizophrenic symptoms), the evaluation of disorders of verbal and 
non-verbal language, the validation of a wide range of syndromes in terms of differential response to 
medication or psychological treatments, the large-scale international studies which have used
standardized methods of psychiatric examination, the semantic differential, or methods of social
assessment. Perhaps the title of the book gives rise to different expectations in Britain than it would 
in the United States but it cannot be recommended here as a systematic and critical review of clinical 
diagnosis. 

J. K. WING 


Cognitive Behavior Therapy: Research and Application. Edited by J. P. Foreyt & D. P. Rathjen. New 
York: Plenum Press. 1978. Pp. 265. £11.93.


Cognitive behaviour therapy is one of the more exciting and promising recent developments in 
psychological treatment. It is an umbrella term used to describe approaches which by means of both 
behavioural and cognitive treatment procedures aim to modify cognitive processes presumed to 
underlie psychological dysfunction. Some interesting research has been conducted but the field is at an
early stage of development in which enthusiastic clinical application is likely to move considerably
beyond any empirically established basis. 

At such a time a thoughtful, critical evaluation of the current status of cognitive behaviour therapy, 
in terms of both the validity of its assumptions and its clinical effectiveness in a range of problem 
areas would be very welcome. Hoping to find this in the book edited by Foreyt & Rathjen, I was a 
little disappointed. The book opens well with a thoughtful chapter by Terence Wilson asking the 
question whether cognitive behaviour therapy (CBT) should be considered a paradigm shift or 
passing phase. He suggests that CBT is best considered as one aspect of a general trend within 
behaviour therapy to take greater account of cognitions as mediating variables, as exemplified by 
Bandura's social-learning theory. As such, he suggests it might be more profitable to view CBT as
evolution within mainstream behaviour therapy, rather than as a separate development. 

The rest of the book is devoted to eight chapters of variable quality covering a range of
applications of CBT. Novaco's chapter on anger and Turk's on pain succeed very well in their task. 
Having presented the cognitive conceptualization of these problems and the supporting evidence, they 
proceed to describe the treatment interventions which have been derived and present evidence 
evaluating the effectiveness of these procedures. A welcome feature in both cases is the detailed 
description of what is actually done in treatment. Rathjen, Rathjen & Hiniker in their chapter on a 
cognitive analysis of social performance present a somewhat uncritical compendium of cognitive 
approaches to conceptualizing and treating deficiencies in this area with relatively little consideration
of research. Steger's chapter on CBT for sexual disorders suffers heavily from the lack of any work 
that has actually been done using distinctively CBT approaches to these problems, as does Gentry's 
chapter on somatic disorders. Burns & Beck present a largely clinical account of the cognitive model 
and treatment of mood disorders. This has been one of the main areas in which cognitive approaches 
have been developed, and in which research has been undertaken. The only research directly 
referenced in the chapter is an important study in which cognitive therapy was found to be superior 
to imipramine in the treatment of clinical depression. Doyle & Bruhn use their chapter to present a
clinical case report of a variant of covert sensitization, and in the final chapter Cameron shares with 
us his cognitively oriented conceptualization of why some patients cooperate with psychological 
treatment while others do not. 

Overall this is a slightly disappointing volume. This to some extent reflects the undeveloped nature
of the area and the book may be useful in conveying some impression of the current state of cognitive
behaviour therapy. However, I think it could have been done better. 

J. TEASDALE 


Medicine, Mind and Man. An Introduction to Psychology for Students of Medicine and Allied 
Professions. By John Cohen & John H. Clark. Reading: W. H. Freeman. 1979. Pp. xviii + 401.


Since the Todd Report recommended including behavioural science in the British medical curriculum, 
the lack of a suitable text for introducing psychology to medical students has been a problem. 
Medicine, mind and man represents one attempted solution, but I would expect that if it were
used for the purpose, the book would compound rather than alleviate existing difficulties.
Students would probably more often feel that they were being preached at and talked down to than 
that they were acquiring a reasonably clear conspectus of what is meant to be a basic scientific 
discipline. If they ended up with their initial uninformed prejudices about psychology confirmed 
rather than replaced by more informed ones, they could hardly be held to blame.

Probably the major shortcoming of the book is that it takes for granted much of what it might
reasonably be thought to have been intended to explain. As a science, psychology is what 
psychologists do. But instead of giving a clear statement to the effect that psychologists understand 
the activity they are engaged in to be the scientific study of behaviour and the application of the 
results of this study in practical settings, the authors take as their point of departure the admirable
sentiment that ‘much of psychology might be summed up in the words: “be kind and considerate to 
your patient”’ (p. xiv). Their initial failure to distinguish between psychology as an attempted science 
and ‘psychology’ as a lay term for gentle Machiavellianism (as in the phrase ‘just use a bit of 
psychology’) is compounded by (1) the authors' readiness to lumber aspiring doctors with gratuitous
and platitudinous advice (‘A doctor should try to convey no more and no less than his patient can 
bear’, p. 11); (2) their reliance on schematic, idealized Platonic notions of doctor-patient 
relationships, which are passed off as apparent descriptions of reality (‘the patient is encouraged to
speak and the doctor listens, without interruption...’ (italics theirs) (p. 15). When was the last time
either of these gentlemen attended a hospital out-patient clinic?); (3) their etymological rather than
psychological conception of a ‘patient’ (pp. 9-10); (4) their failure to consider ‘doctor’ and
‘patient’ within the framework of social roles in general and to acknowledge, in particular, that there
are many different types of doctor-patient relationship, depending on, inter alia, the nature and stage 
of the patient's problem, the social conditions in which the protagonists interact, the nature, duration 
and locus of any treatment which is to be administered, if indeed treatment, as opposed to advice, is
involved; (5) their presentation of theoretical issues in psychology without much, if any, coverage of 
the phenomena to which the issues refer. 

A second major shortcoming is a great unevenness in the authors' coverage of topics. Road safety 
and accidents, suicide, handedness, sex and marriage, all warrant chapters, whereas behavioural 
treatments are allocated a bare two pages. They are presented as ‘widely used by psychiatrists’ (p.
243). What about psychologists? The authors practically never mention psychologists as such and it 
is difficult to imagine that the student could derive any useful notion about what role they might play 
in clinical settings. 

A third major shortcoming is that the authors say virtually nothing about the methods 
psychologists use to gather their data and the conventions they follow in interpreting them.

The resulting compendium of anecdotes, prescriptions, tables, lists, figures, schemata, occasional 
allusions to empirical or experimental reports, and potted Freud is most disconcerting to try to read. 
I would not agree with the claim on the back cover that it is ‘written in a lucid and often witty style’; 
often it reads as though it had been written by medical students, under exam conditions, rather than 
for them. It seems unlikely in the extreme to contribute to the making of doctors who are kind and 
considerate to their patients. It might even put them off psychology altogether, which would be very 
sad. 

VICKY RIPPERE 


Attention and Information Processing in Schizophrenia. Edited by S. Matthysse, B. J. Spring & J.
Sugarman. Oxford: Pergamon Press. 1979. Pp. 337. £40.00. ISBN 0 08 0231268.


This book contains the proceedings of a conference held in 1976 which have also been published in 
the Journal of Psychiatric Research, 14, 1978. Although the idea that a deficit in 
information-processing underlies schizophrenia has been current for some time and has generated a 
great deal of research the precise nature of this deficit still remains unclear. Many of the reasons for 
this lack of progress are illustrated in this book. Indeed about half of the contributions do not report 
new work with schizophrenic patients, but discuss various methodological problems or approaches to 
the measurement of attention and information-processing that have not yet been applied to patients. 
This includes some useful discussion of the neuroanatomical basis of attention and the use of the 
EEG evoked potential as a measure of attention. 

In the last section of the book various requirements of the tasks used to test patients are discussed. 
The essence of these ‘methodological maxims’ is that the task should demonstrate a specific rather 
than a general abnormality in schizophrenic patients. Two important points are made by Spring & 
Zubin concerning this problem. First, a task should be found on which patient performance is in 
some sense better than normal. Almost any theory which predicts poor performance by patients on 
some task will be supported, but this may be due to the general deficits associated with discomfort, 
lack of cooperation and so on. Secondly, a task or condition should be included in which the 
schizophrenics are not impaired whereas some other psychiatric control group (e.g. depressed
patients) is. Only in these circumstances can we be certain that a specific rather than a general deficit
has been associated with schizophrenia. 

This second point is related to Chapman & Chapman’s analysis of the measurement of differential 
deficit. In a typical experiment schizophrenics and controls will be compared in two conditions with 
and without distraction. The distraction is frequently found to have a greater effect on the 
schizophrenics. However, they usually perform worse than the controls even on the base line 
condition without distraction. Chapman & Chapman show (as they have elsewhere) that unless the 
tests are matched for ‘true score variance’ such results cannot be interpreted. This can be seen as a
problem of scaling. For example it might be the case that if ratios rather than differences were used 
to compare the conditions the differences between the groups would disappear. We cannot say which 
is the correct interpretation. Chapman & Chapman solve this problem by constructing very carefully 
matched tests. They have found that some, but not all, schizophrenic deficits disappear when this is
done. It might be worth adding an alternative solution where the control group is matched, not on 
somewhat arbitrary variables like age, sex and IQ, but on performance in one of the conditions of 
interest. This matching can usually be achieved by choosing some other abnormal group and is thus
equivalent to Spring & Zubin’s solution. 
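
The scaling problem can be made concrete with a small numerical sketch. The reaction times below are invented purely for illustration and come from no study discussed in the book; the function name `distraction_effect` is likewise hypothetical. With these figures the distraction costs the patients twice as many milliseconds as the controls, yet on a ratio scale both groups are slowed by exactly the same proportion.

```python
# Hypothetical mean reaction times in milliseconds (invented for illustration;
# not taken from Chapman & Chapman or from any chapter in the book).
scores = {
    "controls": {"baseline": 400.0, "distraction": 500.0},
    "schizophrenics": {"baseline": 800.0, "distraction": 1000.0},
}


def distraction_effect(group: str, scale: str) -> float:
    """Return the effect of distraction for one group on the chosen scale.

    scale="difference": distraction minus baseline (additive scale)
    scale="ratio":      distraction divided by baseline (multiplicative scale)
    """
    baseline = scores[group]["baseline"]
    distraction = scores[group]["distraction"]
    if scale == "difference":
        return distraction - baseline
    if scale == "ratio":
        return distraction / baseline
    raise ValueError(f"unknown scale: {scale}")


for scale in ("difference", "ratio"):
    effects = {group: distraction_effect(group, scale) for group in scores}
    print(scale, effects)
```

On the difference scale the patients appear to have a differential deficit; on the ratio scale they do not. Nothing in the data themselves says which scale is the appropriate one, which is why matched tests or a performance-matched control group are needed.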

Disappointingly few of the studies reported in this book fulfil these rather stringent criteria for 
demonstrating deficits specific to schizophrenia. More disturbingly, many of them fail to fulfil some 
more basic requirements for research in schizophrenia. First there must be clear definitions of the 
patients being tested. It is not enough to say that two psychiatrists agreed on a diagnosis of 
schizophrenia. We need to know what sort of criteria they used. Ideally this can be achieved by using 
standard interviews and diagnostic procedures such as the Present State Examination or the Research 
Diagnostic Criteria. Many of the studies in this book do not use such procedures or give no 
indication what procedures were used. Once again Spring & Zubin highlight the problem. They 
initially selected patients on the basis of a hospital diagnosis of schizophrenia and then applied the 
Research Diagnostic Criteria. In terms of this system the patients were spread among a broad range 
of diagnostic categories.

A second basic requirement is that the tasks being used must have a firm basis in experimental 
psychology so that we know what functions are being measured and which are the important 
parameters of the task affecting performance. Ideally the task should also relate to one of the major 
symptoms of schizophrenia. 

These are just a few of the problems in research on schizophrenia that become very much apparent
when reading this book. However in the more interesting papers many of the problems have been 
overcome. 

Among the many studies of reaction time that of Wishner et al. stands out in that an attempt was 
made to study the well known 'slowness' of schizophrenic patients in terms of the 
information-processing stages of Sternberg. It seems very reasonable and helpful to ask the question 
in which stage does this slowing occur. The results show very clearly that although the schizophrenics 
are generally slower their rate of processing information does not differ in any of the stages studied. 
Of crucial importance is the revelation in a footnote that mentally subnormal subjects who are also 
generally slow show a slower rate at the stage of memory search. Thus the schizophrenic slowness 
must either relate to the stage of response organization or cuts across all stages, but is not a general 
phenomenon of impaired performance. 

Collins et al. report, as they have elsewhere, one of the few tasks in which schizophrenics do better 
than controls. It seems that patients can distinguish very minimally separated flashes of light which 
for controls are fused. At the moment this is a fascinating, but isolated, finding which needs 
desperately to be related to other aspects of information-processing and to the symptoms of 
schizophrenia.

Finally Cohen and Rochester have investigated directly some of the peculiarities of language which 
give rise to 'thought disorder'. These studies are of interest because they attempt to understand one 
of the more important symptoms of schizophrenia. Both authors conclude that the problem is not a 
linguistic one as might be found in aphasia, but is a problem in using language to communicate. 
Cohen finds that schizophrenics fail to edit out irrelevant responses while Rochester finds that they 
fail to provide sufficient referents for the listener. Both findings generate interesting and testable 
hypotheses. 

This book will be a disappointment for those hoping to learn much that is new or exciting in the 
field of information-processing and schizophrenia. However, many of the problems of studying
abnormal behaviour are highlighted either explicitly or implicitly. If the resulting methodological
maxims were heeded by workers in this field the next conference in this area should be extremely 
interesting. 

CHRIS FRITH 


The Environment at Work. By E. C. Poulton. Springfield, Ill.: Thomas. Pp. xiv +162. 


This book reviews briefly, in 150 pages, the effects of a wide range of working conditions upon work 
performance. It deals with each topic in a self-contained chapter and gives some idea of the optimum
conditions for each variable. Most of the chapters deal with the physical work environment; such 
factors as lighting, heating, noise, pollution, wind pressure and vibration. The dimensions are 
described in terms of SI units and care has been taken to try to present the effects of good or poor 
conditions as clearly as possible. Nevertheless, some of the graphs are quite complex ones. Other 
chapters are connected with reactions to the psychological environment in which certain work has to 
be done; shift work, perceived danger, work overload and underload. These are less technical in their 
presentation since the variables are not themselves describable in terms of SI units. 

We are told that the book is meant for everyone who works, but in particular is ‘suited’ to the 
needs of trainees, both those training for management and the professions and those training to 
represent employees. It is thus not primarily meant to be a book for the ergonomist but for people 
who have to assess and argue about working conditions as part of their jobs. The text is easy enough 
to read, but it does tend to be didactic. Too often it leaves the reader asking ‘on what grounds is this 
conclusion drawn?' Unfortunately, insufficient detail is provided to allow us to answer, or even to 
find out the answer. To give one example from ch. 11 on work overload and underload, coronary 
heart attacks are dealt with on p. 131. We are told that there are reports that work overload leads to 
greater probability of a coronary, and that prospective studies are more satisfactory evidence of this 
than retrospective ones. We are not told what studies have actually been done, nor the figures
obtained from such studies. At the end of the section we are informed that ‘giving up smoking and 
not being overweight are probably at least as effective methods of preventing a coronary heart attack 
as is reducing the work load’, but again with no reference to figures and studies done. Thus someone
interested in this topic would not be able to put forward an argument based on factual data to 
oppose someone who believes that ‘hard work never hurt anyone’. It is also the case that when 
experiments are quoted they are not always attributed, let alone referenced. In the chapter on 
perceived danger we are given some detail about an interesting experiment on repair of radios under 
apparent threat of air attack; but not who did the work, where or when. 

In fact, apart from acknowledgements for graphs and tables, the only references provided in the 
book consist of some suggestions for further reading — and these are very limited in number. One 
purpose of a book like this must surely be to interest readers in a field that they had not thought 
much about or had thought more simple or ‘obvious’ than is in fact the case. I think that the book is
likely to arouse just such interest; but it is going to leave a large number of frustrated readers with 
unanswered questions. 

SHEILA CHOWN 


Childhood Psychopathology: A Developmental Approach. By I. J. Knopf. Englewood Cliffs, N.J.: 
Prentice-Hall. 1979. Pp. 497. £11.65. 


As a text for courses that include the topic of children's abnormal behaviour, there is no doubt about 
this book's usefulness. It would be ideal, however, as the major reference for a course specifically 
covering childhood psychological disorders, though such courses are currently rare. The author has 
made a reasonably successful attempt to combine the area of child psychopathology with the major 
issues in developmental psychology and has achieved very broad and quite detailed coverage. Perhaps 
most important is the fact that the author does not presuppose in the reader a knowledge of general 
abnormal psychology and therefore provides relevant basic information. The book as a whole is 
clearly written, easy to read and understand. Basic terms are well defined when first introduced and a 
comprehensive glossary is provided. The information is up-to-date and very well organized, with each 
chapter including a detailed summary and many references. A comprehensive index, by author and
subject, makes reference to specific topics very easy. A balanced and unbiased approach has been
achieved in that all the major theories and alternative treatment approaches are fairly described with 
reasonable criticism where appropriate. Perhaps the most attractive feature of the book, however, is 
that the real flavour of this area is provided by the use of many vivid and interesting case
descriptions. In this way, even the reader who has never been confronted with the problems of
childhood abnormality is vividly presented with the basic raw material.

The book is organized into four parts and follows a standard pattern for texts of abnormal 
psychology. It begins with a discussion of general issues and theories before moving on to describe 
specific psychological disorders, concluding with a discussion of the problems of prevention. Part One 
contains a concise history of the field, followed by a lucid account of the problem of defining the 
concept of abnormality. A lengthy description of alternative classificatory schemes is provided, and 
although the approach is in general critical, my impression is that the author is somewhat
over-optimistic about their usefulness and neglects to stress important disadvantages. Nevertheless, a 
very clear idea is provided of the magnitude of the problems to be considered. Chapter 4 is concerned 
with a very general overview of normal personality development. Although relatively brief and 
somewhat uncritical, there is reasonable discussion of topical issues such as the interactive nature of 
the mother-infant interaction. The first section concludes with a detailed description of models of 
psychopathology which includes much useful background information, such as basic genetics and 
neurophysiology. 

Part Two contains two chapters which provide a very detailed and broad description of assessment 
procedures and methods of treatment. Together they amount to a very useful and practical 
introduction to the clinical process. The author is sensitive to the many problems involved in 
assessment and therapy, and illustrates his points by good use of case material. The chapter on 
assessment covers in some detail all the major methods ranging from the clinical interview and 
observational methods to objective testing. Chapter 6 provides a comprehensive review of therapy 
methods including types of individual and group psychotherapy, behaviour therapy, somatic 
treatment and milieu therapy. The author provides a vivid account of the state of the art, the intrinsic 
complexities of the treatment process and the difficulties in research. 

Part Three is concerned with a quite elaborate discussion of specific syndromes which are generally 
organized in age order. Each problem is well described before considering aetiological factors and 
treatment methods. Beginning with eating, sleeping and elimination problems, the author moves on 
to a good discussion of the childhood psychoses, educational disorders, mental retardation, the 
psychoneurotic and psychophysiological disorders and the problems of adolescence, including 
delinquency, sexual disorders and drug abuse. 

The final brief section of the book looks towards the future by considering what might be achieved 
in terms of prevention of the various problems. 

Overall, this is a very good introductory text which should be extremely useful for teaching 
purposes. The only major criticism is that the author tends to neglect the British literature. Apart
from this, the book is extremely readable and informative. It contains most of the material that 
should be included in a work of this kind and combines it with an approach that is human, warm 
and sensitive. In general it manages to convey to the reader a clear feeling of what is involved in this 
clinical field. The colour of the area is not lost by the potentially cold description of facts, theories 
and methods. 

HILTON DAVIS 


Emerging Strategies in Social Psychological Research. Edited by G. P. Ginsburg. Chichester: Wiley. 
1979. Pp. 319. 


This collection of 11 chapters is based partly on the BPS workshop on ‘New developments in social
psychological methods’ held in Oxford in 1975, and represents a testament to the achievements of
what, with apologies where appropriate, might be characterized as the Oxford-Reno school of social 
psychology. It is a varied collection in many ways. In terms of overall quality, it is something of a 
curate's egg. Some parts of it are excellent. Other parts, if not exactly rotten, lack freshness and could 
profitably have been reduced for quick clearance. In spite of its origins, it cannot simply be viewed as
a handbook of research methods. Useful hints on makes of film projectors (in the chapter by 
Kendon) or on computer programs for multidimensional scaling analysis (in the most informative 
chapter by Forgas) sit uneasily alongside other chapters which make no real pretence at giving 
explicit instructions in methodological procedures. I felt a little sorry for Collett who obviously took 
great care over his clear exposition of (by now fairly familiar) repertory grid procedures, unselfishly 
confining himself to the odd page of theoretical criticism, while others allowed themselves much more 
extravagant flights of fancy. 

‘Extravagant’ certainly seems an appropriate epithet for the technique of negotiated autobiography
construction described by de Waele & Harré. This chapter should certainly interest those who feel
that the collection of personal accounts is potentially as important for social psychology as it has 
become for sociology. On the other hand, the specific method proposed seems to involve so many 
researchers for a single subject, and so much time, that I am sceptical how convincingly it could ever 
be defended as cost-effective. 

Among the other chapters, those by Clarke on the use of linguistic analogies for conceptualizing 
sequential behaviour, and particularly by Scaife on observing infant social development, combine 
perhaps the happiest blends of critical theoretical and methodological comment. Argyle's chapter is
an uncomplicated plea for more attention to the sequential nature of social interaction (which 
remains one of the major themes of the book) and he seems unafraid of making a number of 
surprising generalizations. For instance: ‘Social psychologists have paid little attention to the ways in 
which social behaviour is affected by the situations in which it takes place’ (p. 11). Laboratory and
field-experimental manipulation of situations presumably doesn't count for purposes of this argument 
but one wonders exactly why not. 

The neglect of field-experimental methods is one of the most unfortunate omissions of this volume. 
The only real comment on them is contained in a single paragraph (p. 134) of Ginsburg's chapter on 
role-playing techniques, where he lets it be known that he regards them as 'deceptive', and argues 
(not necessarily congruently with, e.g. Argyle) that ‘the natural setting is difficult to control; and it is 
not necessarily the optimum location for the investigation of situated actions’. As for other 
experimental techniques, one looks in vain for any differentiated or constructive analysis. 

Ginsburg's chapter is followed by a rerun by Mixon of his critique of Milgram's obedience 
experiments, in which he yet again avoids discussing the critical concept of perceived responsibility.
Theirs are probably the most polemical of the chapters. They are not alone, however, in their 
tendency to slip into a style of writing which suggests, perhaps unintentionally, the offensively 
patronizing implication that anyone who might be disposed to look for merit in methods or theories 
that are not part of their party line is at best an ignoramus in the philosophy of science and at worst 
a benighted behaviourist engaged in a futile search for universal causal axioms. 

While reading such arguments I was struck by the contrast in style with a recent short paper by 
Gustav Jahoda (‘A cross-cultural perspective on experimental social psychology’, Personality and 
Social Psychology Bulletin (1979), 5, 142-148) which presents the case against ‘universalism’ in social 
psychology calmly and cogently. His argument is that experimental social psychology has generally 
been concerned with behaviour in ‘free social space’, where culturally defined roles and rules are 
relatively flexible; such flexibility simply does not obtain in many traditional cultures. The 
approaches described in this volume have great potential for describing the social contexts of roles 
and rules applicable to our own culture — contexts to which many social psychologists have given 
insufficient attention. Nonetheless, as the various contributors would all admit, there is room for 
choice and decision by individuals and groups within such contexts. How such choices and decisions 
are made is the central question with which most mainstream ‘cognitive’ social psychology has been 
concerned, yet it is barely touched upon directly in this volume. One is dealing here surely with 
complementary emphases and areas of competence rather than, as is apparently implied, with a clash
between mutually incompatible intellectual systems. 

J. RICHARD EISER 


The Selected Writings of A. R. Luria. Edited by M. Cole. New York: M. E. Sharpe. 1978. Pp. 351. 
$22.50. 


This book contains 13 articles which have been selected by Michael Cole ‘to show Luria’s efforts to 
use experimental clinical psychology as a basic tool for psychological analysis' but in no way is there
an attempt, as he states in the introduction, ‘to cover depth or breadth’. The volume is divided into 
three parts. 

The first part ‘Early beginnings’ contains only one article ‘Psychoanalysis as a system of monistic 
psychology’. This paper dates from the early developments of Soviet psychology which, according to 
Luria, needed radical reworking in terms of scientific method and dialectic materialism. He was aware
that psychoanalysis proceeded from fundamental postulates which were vastly different from those of 
experimental psychology; and, that the psychoanalytical interpretation of the individual personality 
traits was often close to a philosophical idealism. Nevertheless, he saw in psychoanalysis the first
attempt to construct a theory not on the basis of subjective mental qualities but on an organic basis. 
For this reason he regarded psychoanalysis as a system upon which it was possible to construct
models consistent with a materialist Marxist doctrine which views mental life as one of the various
kinds of organic phenomena. 

The second part contains six essays on developmental psychology, most of them published between 
the years 1925 and 1930 representing Luria's early interest in child psychology when he was strongly 
influenced by the writings of Vygotsky. The first paper discussed the role of social environment on 
speech. He used the method of free association, but the interpretation of his findings is far
distant from psychoanalytical theory, relying more heavily on Vygotsky's major tenet, namely that
language is the product of socio-historical circumstances and the form of interaction between child
and adult. There are also in this section reports on the intellectual development of children which, as 
the editor aptly comments, show a strong influence of Western psychologists such as Piaget, Bühler 
and Stern. There is a particularly interesting paper entitled ‘the development of writing in the child’. 
In this study Luria attempts to demonstrate that writing, as a culturally determined form of 
communication, is preceded by scribbling and pictorial representations. These graphic activities are
used by the children, before reaching school age, as an attempt to represent past events in the 
present. 

The last paper in this section, ‘The formation of voluntary movements in children of pre-school
age’, was written not by Luria but by one of his students (O. K. Tikhomirov). The investigation
is centred on the role of the child's own external speech in regulating his motor reactions and
attempts to demonstrate that the specific stimulus that evokes voluntary movement is speech. The
question of whether speech precedes and guides the motor response or whether it only accompanies 
action was the mainspring of a great deal of work carried out later on by Luria on the role of speech 
in the development of mental processes. 

The last part of the volume contains six papers on neuropsychology concerning a variety of topics; 
some deal with general issues, for instance the problem of functional localization and the reliability of 
psychological investigation. In other articles Luria describes specific research projects such as the 
disturbances of intellectual function in patients with frontal lobe lesions, memory disorders in 
diencephalic tumours and the mechanisms of eye movements in normal and pathological vision.
These papers, with the exception of one, cover work published between 1961 and 1969.

The volume has the merit of containing a selection of the lesser known contributions of Professor
Luria, all of which have appeared in Russian journals and books not easily available in this country.
There is also a full text of his address to the International Congress of Psychology in London in 1969 
which has been unobtainable. 

MARIA A. WYKE 


Understanding Infancy. By E. Willemsen. San Francisco: Freeman. 1979. Pp. xiii + 331. Cased, £9.50;
paper, £4.40. 


Infancy is probably the most intensely studied period of the human life cycle. A visit to an American 
conference on developmental psychology or a glance at the key journals will indicate that anything 
from a quarter to a half of the contributions are concerned with infancy. An up-to-date review of this 
material — most of which has appeared in the last decade — would be desirable indeed and that is what 
Eleanor Willemsen has attempted. The results are rather disappointing. In her wonderful book on 
French cooking, Julia Child warns against the use of the electric liquidizer because it reduces a soup 
to a monotonous pap. Willemsen, on the other hand, appears to have a passion for liquidizing. The
excitement of research with infants has always been that it offered the opportunity to establish
something fundamental about the human mind. Whether it is established that newborns can 
spontaneously imitate or that they have little conception of the third dimension, a long-standing 
question about the origins of human knowledge is being tackled. Yet in Willemsen's book, these 
intellectual issues are handled in the same bland tone as the question of how to soothe a crying baby. 

The book does offer broad coverage. There are chapters on prenatal development, the capacities of 
the newborn, learning and perception, cognitive development, social development and personality, 
together with an appendix on research methods. However, in all of these chapters, the emphasis is on 
a review of the results rather than on any new integration of theoretical controversy. This often 
means that topics are discussed in proportion to the amount of research that they have generated, 
whether or not the research is of any theoretical consequence. For example, a good deal of attention 
is given to the dreary issue of whether infants prefer complex stimuli (although complexity has never 
received a convincing theoretical definition) but recent work on colour vision and speech perception 
during infancy is given only a very brief treatment.

Because of the specialized nature of the subject matter, it is obviously the kind of book which, in
Britain at least, would potentially be useful for an advanced optional course in developmental 
psychology, especially since there are few alternatives in the market. Its main rivals are the trio of 
books produced by T. G. R. Bower in the last 5 years. A comparison with Bower’s books is therefore
of some importance. The contrast is very clear. The last thing that Bower can be accused of is 
monotony. No doubt, he sometimes alarms the more cautious reader, but I cannot imagine many 
students genuinely preferring Willemsen’s book. 

The book’s authorship is something of a puzzle. Eleanor Willemsen’s name appears on the outside
cover but the preface indicates that Louise Nicholson ‘rewrote the entire manuscript to clarify the 
diction and added explanatory material, such as the examples’. My guess is that this rewriting is part 
of the explanation for the book’s blandness. Nor is such stringent editing exceptional. American 
textbooks are fast becoming as packaged and homogenized as take-away foods. It’s not just liquidizers 
that spoil the broth. 

PAUL L. HARRIS 


Brit. Jnl. of Psychology, 71, 1 () 


Distribution-free methods for non-parametric problems: 
A classified and selected bibliography 


Compiled by Bernard Singer (University of Reading) 


* A guide to primary source material drawn from publications in psychology 
and related fields, such as sociology, medicine and biology, as well as 
from the mathematical and statistical literature. 

* Sources grouped under pragmatic headings with easy to follow cross- 
reference system. 

* Main coverage post-1961 to 1978 but early classics and late additions 
appended. 

* Comprehensive author index. 


Publication: September 1979 Price: £5.50 (US$18.00) Pp. 72 
ISBN 0 901715 10 7 


Orders to The British Psychological Society, The Distribution Centre, 
Blackhorse Road, Letchworth, Herts SG6 1HN, UK. 





Suggestions to contributors 


Completely revised and rewritten by the Standing Committee on Publications, this 
document sets out to guide the inexperienced through the often confusing 
processes of preparing and submitting scientific papers. 


* detailed guidance on manuscript preparation, from abstract to references 


* suggestions for preparing tables and figures and setting out complex 
mathematics 


* advice on copyright and how to obtain permission to use already published
work 


* how to cope with headings and abbreviations 


Price to BPS Members £1.00 Price to Non-Members £2.00 


Available from The British Psychological Society 
St Andrews House 
48 Princess Road East 
Leicester LE1 7DR 


NOTES FOR CONTRIBUTORS 


1. The Editorial Board of The British Journal of 
Psychology, although giving preference to reports 
of empirical studies likely to bear upon our under- 
standing of general psychology, is also ready to 
consider the publication of review papers. 

It may be possible occasionally to give early 
publication to very short articles, not exceeding two 
printed pages in length (about 1000 words). As 
a rule such articles need not have a summary or 
titled headings. 


2. The circulation of the Journal is world-wide. 
There is no restriction to British authors; papers 

are invited and encouraged from authors throughout 
the world. 


3. Papers should be as short as is consistent with 
clear presentation of the subject matter; in general 
they should not exceed about 7000 words. A 
summary of up to 200 words should be provided. 
The title should indicate exactly but as briefly as
possible the subject of the article. Papers will be 
evaluated by the Editor and referees in terms of 
their theoretical interest, practical interest, relevance 
to the Journal, and readability. 


4. Publication is speeded by care in preparation. 

(a) Authors are asked to submit a separate front 
page reporting the title of the article and their 
name(s) and affiliations. The author(s) name(s) 
should not then appear as such within the text. 

(b) Contributions should be typed in double spacing
with wide margins and only on one side of each 
sheet. Sheets should be numbered. The top copy 
and at least one carbon copy should be sub- 
mitted and a copy should be retained by the 
author. 

(c) Tables should be typed in double spacing on 
separate sheets. Each should have a self- 
explanatory title and should be comprehensible 
without reference to the text. They should be 
referred to in the text by arabic numerals. Data 
given should be checked for accuracy and must 
agree with mentions in the text. 

(d) Figures, i.e. diagrams, graphs or other 
illustrations, should be on separate sheets, 
numbered sequentially ‘Fig. 1’, etc., and each 
identified on the back with the author's name 
and the title of the paper. They should be 

carefully drawn, larger than their intended size,


Mischel and the concept of personality


Michael W. Eysenck and Hans J. Eysenck 





The various criticisms that Mischel has made of the state-trait approach to personality are considered 
and found to be lacking in substance. His major argument is that the actual inconsistency of 
behaviour is incompatible with the expectation of behavioural consistency that follows from the 
state-trait approach. However, Mischel has misread the evidence, and pays insufficient attention to 
the distinction between consistency at the intervening-variable level and consistency at the 
behavioural level. In addition, Mischel and others have evaluated state-trait theories from a rather 
narrow perspective and thus have failed to appreciate the substantial contribution made by such 
theories. It is concluded that personality forms an indispensable part of experimental and applied 
psychology, and that Mischel's criticisms have unfortunately tended to accentuate the schism between 
personality and experimental psychologists.





Over the past decade, there has been increasing criticism of the state-trait approach to 
personality. While doubts had been expressed previously, for example by Vernon (1964), it 
was the publication of a book by Mischel (1968) that provided the impetus for much of the 
subsequent debate. For purposes of expositive clarity, it will be assumed that state-trait 
theorists (e.g. R. B. Cattell, H. J. Eysenck, J. P. Guilford) share the following 
preconceptions about the most appropriate approach to theorizing in the field of 
personality: 

(1) Individuals differ with respect to their location on important semi-permanent 
personality dispositions, known as ‘traits’. 

(2) Personality traits can be identified by means of correlational (factor-analytic) studies. 

(3) Personality traits are importantly determined by hereditary factors. 

(4) Personality traits are measurable by means of questionnaire data. 

(5) The interactive influence of traits and situations produces transient internal 
conditions, known as ‘states’. 

(6) Personality states are measurable by means of questionnaire data. 

(7) Traits and states are intervening variables or mediating variables that are useful in 
explaining individual differences in behaviour to the extent that they are incorporated into 
an appropriate theoretical framework. 

(8) The relationship between traits or states and behaviour is typically indirect, being 
affected or ‘moderated’ by the interactions that exist among traits, states, and other salient 
factors. 


The Thorndike-Mischel critique: Behavioural consistency 


Theories of this kind, be they trait or type theories, have been most forcefully criticized by
Thorndike (1903), who held that ‘there are no broad, general traits of personality, no
general and consistent forms of conduct which, if they existed, would make for consistency
of behaviour and stability of personality, but only independent and specific
stimulus-response bonds or habits’ (p. 29).* This doctrine of ‘Sarbondism’, as McDougall used to
refer to it, with its attending notion of the equipotentiality of the CS, has by now more or
less disappeared from psychology, and does not therefore require an extended answer; it
may, however, be useful to point out that even within ‘Sarbondism’ consistency of
behaviour and personality is by no means ruled out. It is not difficult to envisage
conditions of life which would favour the production of consistent sets of S-R bonds which
might give rise to certain traits; thus soldiers in a Guards regiment would be subjected to
many conditions in which tidiness would be rewarded, and untidiness punished. This
should, even on Thorndike's own grounds, give rise to consistently tidy behaviour, or a
trait of ‘tidiness’. In more modern terms, a consistent history of reinforcement should be
able to create consistent forms of behaviour, and persistent behavioural traits and types.

* Other typical statements of early situationism from Thorndike (1903) are the following: ‘The striking thing is
the comparative independence of different mental functions even where to the abstract psychological thinker they
have seemed nearly identical. There are no few elemental faculties or powers which pervade each a great number
of mental traits so as to relate them closely together’ (p. 28). And again ‘The mind must be regarded not as a
functional unit nor even as a collection of a few general faculties which work irrespective of particular material,
but rather as a multitude of functions each of which is related closely to only a few of its fellows, to others with
greater and greater degrees of remoteness and to many to so slight a degree as to elude measurement’ (p. 29).

This division of opinion regarding consistency of conduct gave rise to many 
experiments in the 20s and 30s; these have been reviewed by H. J. Eysenck (1970) in some 
detail. He concluded that these studies gave unambiguous evidence of consistency of 
behaviour, even when, as in the case of the large-scale work on honesty, deceit, self-control 
and organization of character (Hartshorne & May, 1928, 1929; Hartshorne & Shuttleworth, 
1930), the original authors drew an opposite conclusion from their data. H. J. Eysenck also 
discussed in detail the applicability of many of the criticisms later made of the concept of
consistency, and showed them to be largely mistaken. 

More recently, Mischel (1969) has taken up the argument, suggesting that while trait 
theory predicts behavioural consistency, it is behavioural inconsistency that is typically 
observed. He writes: ‘I am more and more convinced, however, hopefully by data as well 
as on theoretical grounds, that the observed inconsistency so regularly found in studies of 
noncognitive personality dimensions often reflects the state of nature and not merely the 
noise of measurement’ (p. 1014). The basis for this assertion was the partial review of the 
relevant literature by Mischel (1968), who concluded that measures of consistency in 
personality rarely produce correlations as high as 0.30.

Mischel's argument is subject to the same criticisms as Thorndike's, and these will now 
be presented very briefly; a more extended discussion of the evidence supporting these 
criticisms, together with a review of much of the empirical evidence, is presented by H. J. 
Eysenck (1970). We should note, however, that while Thorndike wrote at a time when the 
evidence was ambiguous, and too fragmentary to allow of any certain conclusions as 
regards the consistency of conduct, the evidence is by now so voluminous, and so strong 
and unambiguous, that it is curious that Mischel's doctrines should have attracted as much 
attention as they have. Boring would no doubt have explained this fact by appealing to the 
Zeitgeist, which floats like a disembodied spirit above his History of Experimental 
Psychology; we put forward no hypothesis in this connection. 

At the empirical level, an inadequacy of many studies has been the use of very limited 
and unreliable data sampling. The difference that enlarging the data base can make to 
correlational measures of consistency was demonstrated clearly by Epstein (1977). Subjects 
kept records of their most positive and negative emotional experience each day for over 3 
weeks. The mean correlation when either positive or negative experiences were compared 
on only 2 days was less than +0.20, and very much in line with the magnitude of most of
the correlations discussed by Mischel (1968). However, when the mean for all the odd days
was correlated with the mean for all the even days across subjects, the mean correlation for
the pleasant emotions was +0.88, and was only slightly less for the unpleasant emotions.
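
The effect of enlarging the data base can be illustrated with a small simulation. The sketch below is not Epstein's procedure and uses no real data: it assumes, purely for illustration, that each subject has a stable level plus independent day-to-day noise. The function names (`pearson`, `simulate`) and the parameter values (24 days, `noise_sd=2.0`) are hypothetical choices, picked so that single-day correlations come out near +0.20.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_subjects=2000, n_days=24, noise_sd=2.0, seed=0):
    """Return (single-day r, aggregated odd/even r) for simulated daily records.

    Assumed model (illustrative only): score = stable level + daily noise.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n_subjects):
        level = rng.gauss(0.0, 1.0)  # stable individual difference
        records.append([level + rng.gauss(0.0, noise_sd) for _ in range(n_days)])

    # Consistency estimated from two single days only.
    day_a = [r[0] for r in records]
    day_b = [r[1] for r in records]

    # Consistency estimated from the mean of odd days versus the mean of even days.
    odd = [sum(r[0::2]) / len(r[0::2]) for r in records]
    even = [sum(r[1::2]) / len(r[1::2]) for r in records]

    return pearson(day_a, day_b), pearson(odd, even)


if __name__ == "__main__":
    single, aggregated = simulate()
    print(f"two single days: r = {single:.2f}")
    print(f"odd vs even day means: r = {aggregated:.2f}")
```

Under these assumptions the single-day correlation should lie near 1/(1 + noise_sd²) = 0.20, and the correlation between the 12-day means near 0.75. The exact figures depend entirely on the assumed noise level; only the qualitative rise with aggregation bears on the argument.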

The above findings are, of course, based entirely on self-report data. However, Epstein 
(1977) also discussed observations made daily by external judges for 4 weeks on eight 
variables related to sociability and impulsivity. The mean correlation based on two 1-day 
samples of behaviour was +0.37, versus +0.81 for two 14-day samples, and the highest
reliability coefficients were produced by those variables requiring the least inference.

One of the problematical aspects of the Mischel critique is that he sometimes seems to
imply that the putative consistency of personality can be effectively discredited by reference 
to the situational specificity of behaviour. For example, Mischel (1973b) argued that,
‘People may proceed quickly beyond the observation of some consistency which does exist 
in behaviour to the attribution of greater perceived consistencies which they construct' 
(pp. 341-342). The implication that the only place to look for consistency is in overt
behaviour is surely erroneous. Since both trait and state concepts are intervening 
variables, one must distinguish between consistency at the mediating level of states and 
traits, and consistency at the level of specific behavioural responses. It would be 
unreasonable to deny the possibility that specific behavioural inconsistency may coexist 
with a more conspicuous consistency at the mediating level. 

In essence, the data suggest that reasonably high consistency at the intervening-variable 
level is accompanied by apparently inconsistent and situation-specific behaviour. Block
(1977) evaluated the three main kinds of personality data: objective test behaviour, 
self-report, and rating. He concluded that self-report and rating data are often reliable and 
also comparable, but that objective test data tend to be unreliable and inconsistent. 
Mischel's evidence of low reliability coefficients centred, of course, on objective test 
responses. Even here recent studies, and the proper evaluation of earlier studies such as
those of Hartshorne & May, give evidence of impressive consistency. 

Mischel’s criticism leaves out of account the simple fact that complex traits (e.g.
‘honesty’) cannot meaningfully be measured by a single, simple behavioural test. As the
Hartshorne & May studies have shown, intercorrelations between such simple tests are
only +0.2 or thereabouts, giving negligible prediction of actual behaviour as rated by
teachers; when a battery of nine such behavioural tests is used, however, it has
considerable reliability, and correlations with outside, real-life criteria are between +0.5
and +0.6. Thus even behavioural data, when properly used, can give strong evidence of
consistency; inappropriate usage, of course, should not be accepted as evidence against 
consistency. 
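
The gain from combining tests can be made concrete with the Spearman-Brown formula, which is introduced here as an illustration and is not named in the text above. It assumes that the tests are roughly parallel and share a common mean intercorrelation. The function name `spearman_brown` is a hypothetical label for this sketch.

```python
def spearman_brown(mean_r, k):
    """Predicted reliability of a composite of k parallel tests.

    mean_r: mean intercorrelation between the single tests
    k: number of tests combined into the battery
    Assumes parallel tests (an idealization, stated here as an assumption).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * mean_r / (1 + (k - 1) * mean_r)


if __name__ == "__main__":
    # Single tests intercorrelating about +0.2, combined into a battery of nine.
    print(round(spearman_brown(0.2, 9), 2))
```

With a mean intercorrelation of +0.2 and nine tests, the formula gives 1.8/2.6, or roughly +0.69. This fits the claim that a battery can be considerably more reliable than any single test, though it is a textbook approximation rather than a figure reported by Hartshorne & May.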

It is interesting that Mischel (1977) has now accepted that ratings by observers and 
self-ratings can both show impressive reliability and consistency over time. However, the 
proper interpretation of these findings is in dispute. Mischel (1968, 1977) argued that the 
perception of personal consistency in ourselves and others involved the imposition of order, 
and that this served the function of reducing the otherwise unmanageable complexity of the 
actual situational specificity of behaviour. Mischel (1968) expressed the argument in the 
following way: ‘The conviction that highly generalized traits do exist may reflect in part 
(but not entirely) behavioural consistencies that are constructed by observers, rather than 
actual consistency in the subject's behaviour' (p. 43). Finally, Mischel implied that the 
observation of actual behaviour provided the basis for an objective approach to the study 
of personality. 


Mischel’s critique: Some counter-arguments 

(1) One of the best-known of Mischel's criticisms of the state-trait approach is his assertion 
that measures of consistency in personality rarely produce correlations in excess of +0.30.
This criticism is applicable at most to studies considering specific behavioural responses
across two dissimilar situations. As we have seen in the work of Epstein (1977), reliability
coefficients greater than +0.80 can be obtained in self-report and rating data.

(2) Mischel has frequently argued that traits are constructs which are inferred from 
behaviour, implying that the concrete behaviour which is observed is somehow objective. It 
must be doubted whether any straightforward distinction between the objective nature of 
behavioural facts and the subjective way we interpret them is justified. Experimenters
invariably use implicit or explicit theoretical notions to define the particular
response-equivalence classes that are to be used in data collection. For example, Skinner 
(1938) constructed a single-response class, with all responses of sufficient strength to 
depress the lever being considered as equivalent, and all other responses being ignored. It is 
a matter of opinion whether the theoretically based selectivity of observation and 
utilization of a limited number of arbitrarily chosen response-equivalence classes should be 
construed as objective in any important sense. 

(3) The issue concerning response classes is also relevant to Mischel’s position in a 
rather different way. It is a plausible assumption that individuals will appear more 
inconsistent, the more specific are the response-equivalence classes used. Skinner (1938) 
obtained considerable response consistency and predictability using lever depression as a 
response-equivalence class. If, for example, the pressure applied to the lever had been used 
to divide lever presses into several smaller response-equivalence classes, then it is likely that 
most of this predictability would have vanished. Since response-equivalence classes are 
theoretically defined, apparent behavioural inconsistencies may be replaced by 
predictability when there is some theoretical understanding of the most appropriate 
response categories. 

(4) Mischel (1973a) argued that traits are constructed from global overgeneralizations 
based on behaviour. He has not, apparently, considered the possibility that hereditary 
factors might be of importance. This is especially puzzling in view of the fact that the 
evidence from twin studies consistently indicates the substantial part played by heredity in 
the determination of personality. Shields (1962) carried out one of the most thorough 
investigations, and his study had the advantage of including monozygotic twins brought up 
apart. He used a fore-runner of the Maudsley Personality Inventory, and, for the 
extraversion scale, obtained intra-pair correlations of +0.61 for monozygotic twins reared 
apart, +0.42 for monozygotic twins reared together, and of −0.17 for dizygotic twins 
reared together. There was a similar pattern for neuroticism, with the correlations being 
+0.53 for monozygotic twins reared apart, +0.38 for monozygotic twins reared together, 
and +0.11 for dizygotic twins reared together. Although the low correlations for dizygotic 
twins and the greater correlation for monozygotic twins reared apart than together are 
somewhat problematical, the overall pattern of results is clearly indicative of some 
hereditary determination of personality traits. Jinks & Fulker (1970) reanalysed the data of 
Shields by the biometrical method of analysis, and obtained heritability estimates of 54 per 
cent for neuroticism and of 67 per cent for extraversion. 

The experimental evidence from all the relevant twin studies was reviewed by Shields 
(1973), who concluded that nearly all the studies showed evidence of a significant 
hereditary component in extraversion, and many studies showed the same with respect to 
neuroticism or anxiety. Other reviews of the literature are available in H. J. Eysenck 
(1976a) and Nichols (1978). 

In sum, it appears that Mischel has ignored a crucially important determinant of 
individual differences in personality, thus severely reducing the persuasiveness of his 
account of the origins of traits. A further important point is that, given the existence of a 
significant involvement of heredity in personality differences, any adequate theory of 
personality must take account of hereditary factors. It is not obvious how this could be 
done within the context of social learning theory (Mischel, 1973a). On the other hand, 
trait-state theories have typically emphasized the point that personality traits involve some 
hereditary component. Indeed, a critical issue in contemporary personality theory is (or 
should be) the role played by heredity. Since the evidence indicates that hereditary factors 
are important in explaining individual differences in personality, and since the trait-state 
approach is almost the only major theory of personality that acknowledges that fact and 
incorporates hereditary factors by means of the trait concept, it is incumbent upon theorists 
of different persuasions to address themselves to this issue. 

(5) Mischel (1968) pointed out that cross-situational behavioural measures rarely produce 
correlations in excess of +0.30, or 9 per cent of the variance, and thus it appears that 
behavioural inconsistency is the rule rather than the exception. While such relatively low 
reliability coefficients would undoubtedly embarrass a simple trait theory, there are two 
additional pertinent considerations. Firstly, in work discussed in more detail subsequently, 
Sarason et al. (1975) found in a review of almost 140 analyses of variance that personality 
accounted for approximately 9 per cent of the variance on average, and the situation for 10 
per cent. If one adopts very stringent criteria for the minimal percentage of the variance 
that a factor must account for in order to warrant further consideration, then there is the 
danger that researchers will discover that no factors at all are sufficiently important to 
consider! 

Secondly, Mischel (1968, 1973a) has recognized that the task of predicting behavioural 
responses within a trait-state theory can proceed on the basis of ‘moderator variables’ 
(Wallach, 1962). The basic notion is that the influence of any particular trait on behaviour 
will usually be indirect, being affected or ‘moderated’ by a number of other traits, 
mediating variables and situational factors. Mischel has criticized this approach, arguing 
that the more moderators that are required to qualify a trait, the more a trait-based 
formulation resembles a relatively specific description of a behaviour-situation unit. While 
it is true that trait-state conceptualizations have become increasingly complex over the last 
few years, it could very well be argued in view of the complexity of human functioning that 
this is a necessary, and indeed inevitable, development. 

Evidence of some cross-situational specificity of behaviour can only be taken as highly 
damaging for state-trait theories that assume a direct one-to-one correspondence between 
internal traits and behavioural indices. Since most contemporary state-trait theories 
postulate the existence of moderator variables and thus claim only an indirect but 
theoretically predictable relationship between traits and behavioural responses, Mischel's 
evidence loses much of its apparent force. 

(6) Although it is desirable, as Mischel has emphasized, that those factors emphasized by 
a theoretical position should account for a sizeable proportion of behavioural variation, it 
is also the case that there are various other criteria by which theories can, and should, be 
evaluated. One such criterion is a theory's range of applicability. In terms of that criterion, 
state-trait theories have often been outstandingly successful. For example, as H. J. Eysenck 
(1971, 1976b) has shown, the personality dimension of introversion-extraversion has been 
found to be related to performance in a theoretically predictable way in the following, and 
other respects: sensory threshold; pain threshold; time estimation; sensory deprivation; 
perceptual defence; vigilance; critical flicker fusion; sleep-wakefulness patterns; visual 
constancy; figural after-effects; visual masking; rest pauses in tapping; speech patterns; 
conditioning; reminiscence, and expressive behaviour. Mischel’s criticisms suffer from the 
disadvantage of evaluating the state-trait approach from a rather limited perspective. 

(7) Mischel (1968) discussed a variety of research findings that appeared to demonstrate 
highly specific situational effects on behaviour. This apparent inconsistency of behaviour 
contrasts with a persistent tendency among most people to regard others as possessing 
stable and typical behavioural patterns. 

Mischel (1968) attempted to explain this apparent paradox by arguing that people tend 
to proceed far beyond the actual observation of some consistency in behaviour to the 
attribution of greater perceived consistencies which they construct. However, an alternative 
viewpoint is possible, starting with the fact that theoretical analyses and experimental 
paradigms have tended to incorporate the assumption that interrelationships between the 
individual and the situation are unidirectional, i.e. the situation affects the person. Indeed, 
the typical laboratory experiment involves the manipulation of some experimenter-determined 
aspects of the situation (the independent variable) in order to observe the 
behavioural consequences. Since the subject does not control the situations to which he or 
she is exposed, only information about situational influences on behaviour can be obtained. 

Wachtel (1973) has referred to this model of research, with the behaviour of the 
experimenter intended to occur independently of the subject’s activities, as the model of the 
‘implacable experimenter’. Since it is, in fact, indisputable that there are bidirectional 
influences of the situation on the person and of the person on the situation, it may well be 
that experimental research systematically underestimates the consistency of behaviour that 
actually occurs under naturalistic conditions. 

(8) Mischel and his critics even seem to regard the figure of 0.30 as an average measure 
of consistency as meaningful in some way; it is difficult to see how this can be. Essentially 
Mischel is trying to prove a negative, i.e. that conduct is not consistent. This clearly is not 
possible; even if all (n) attempts to discover consistency had been failures, the possibility 
that the (n + 1)th attempt may be successful is not ruled out. To average, as he has done, 
successful and unsuccessful attempts is meaningless; success clearly depends on having a 
theory essentially pointing in the right direction, choosing tests which are both reliable 
and valid, and applying them to an appropriate population under appropriate motivational 
circumstances. If even one of these (rare and unusual!) preconditions is missing, the failure 
of the experiment says nothing about consistency of conduct. 

This point has to be seen against the typical way in which research into personality is 
conducted. Without wishing to caricature the modal paper in this field, it would seem that 
a multiphasic test of personality traits is administered to a population of sophomores, and 
that then all the separate scores of this test are correlated with some criterion. Usually little 
by way of hypothesis is stated, and even when a hypothesis is allegedly tested, it does not 
usually specify a particular trait. Granted that one of the many traits measured by the 
multiphasic test is relevant to the ‘hypothesis’, the others most likely are not; yet the 
correlations with the criterion are usually calculated for all. Given a test measuring 16 
traits (like the 16 PF), and given that one trait is strongly related to the criterion, while 
the other 15 are not, averaging all the observed correlations must inevitably give a low 
mean r; it is this meaningless mean or modal figure that enters into such average figures as 
those quoted by Mischel. 
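
The arithmetic of this averaging argument can be sketched in a few lines of Python. The figures are assumptions chosen purely for illustration: a correlation of 0.60 between the one relevant trait and the criterion, and correlations of exactly zero for the other 15 scales. Neither value is taken from any study cited here.

```python
# Hypothetical illustration: a 16-scale inventory in which only one scale
# is relevant to the criterion (all values below are assumed, not observed).
relevant_r = 0.60   # assumed correlation of the relevant trait with the criterion
n_scales = 16

# The remaining 15 scales are assumed to be unrelated to the criterion.
correlations = [relevant_r] + [0.0] * (n_scales - 1)

mean_r = sum(correlations) / n_scales   # what an indiscriminate average reports
best_r = max(correlations)              # what the hypothesis actually predicted

print(f"mean r over all {n_scales} scales: {mean_r:.4f}")
print(f"r for the hypothesized trait:  {best_r:.2f}")
```

Under these assumptions the averaged figure is a small fraction of the correlation for the single trait that the hypothesis concerned, which is the sense in which the mean is uninformative about consistency.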

Two other possibilities must be considered. It is possible that none of the observed 
correlations with the criterion is high; this does not demonstrate lack of consistency of 
conduct, but merely indicates that the hypothesis tested (assuming there to have been one!) 
is erroneous. No further deduction can be made about consistency of conduct. Granted 
that in any novel field most hypotheses are likely to be in error, failures are to be 
anticipated; they should not be averaged out together with the successes. It has been shown 
that many different deductions can be made from well-established theories, such as that 
linking introversion with cortical arousal (H. J. Eysenck, 1976b; M. W. Eysenck, 1977); 
these cannot or should not be argued away by averaging them with the numerous failures 
of other theories. 

Another important possibility is that the many traits measured by the multivariate test 
are correlated, and give rise to a much smaller number of 'superfactors'; thus the 
California Psychological Inventory (CPI), which has 18 scales, can be shown to measure 
essentially two major personality dimensions, i.e. neuroticism and extraversion (Nichols & 
Schnell, 1963). The question now arises of whether the specific variance of the 18 scales 
contributes anything to the measurement of various criteria over and above the common 
variance summarized in the N and E scores. Reynolds & Nichols (1977) have shown that 
the answer is in the negative; the two superfactors contribute all the valid variance to the 
prediction of the criteria used in their study. We thus have identical degrees of consistency 
(test-criterion) regardless of whether we use 18 scales or two. In the usual way we would 
divide this consistency by 18, obtaining very small values, when we should really divide by 
two, thus obtaining much higher values for consistency! Thus do the customary methods of 
analysis, hallowed by time but unjustified by psychometric criteria, artificially lower quite 
drastically the apparent consistency of conduct. Such statistical artefacts must be guarded 
against. We conclude that Mischel’s summary figure for consistency is strictly meaningless. 

(9) Even if we could take the figure of 10 per cent test contribution to the variance as a 
measure of consistency seriously, it would still not follow that personality factors were 
relatively unimportant. Criteria of conduct are usually highly complex, and any particular 
personality trait by itself would not be expected to predict such complex behaviour 
perfectly. Let us assume that particular behaviours are determined by 10 independent traits 
of personality, each contributing 10 per cent to the variance; this would give us perfect 
predictive accuracy, even though each trait obeyed completely the Mischel limitations! It is 
not suggested that personality testing would ever achieve such high predictive accuracy, but 
some such argument of additivity (or even multiplicability) of traits has been demonstrated. 
High N shows quite different behaviour when accompanied by high or low E (H. J. Eysenck, 
1967); adding high or low P (psychoticism) will again change the outcome (H. J. Eysenck 
& S. B. G. Eysenck, 1976). Combining the particular combination of P, E and N with high 
or low intelligence will again powerfully determine the outcome, and so forth. Mischel 
criticizes a simplistic and unrealistic position never seriously taken by any personality 
theorist; he fails to come to grips with the actual position taken by Cattell, Guilford or 
Eysenck. As in chemistry, study of individual substances is a beginning, but it is not 
enough; we must also study their interrelations and combinations. No doubt the laws of 
combination present special difficulties, but the complexity of human (and animal!) 
behaviour requires us to abandon simplistic hypotheses and look at laws of combination 
which may be curvilinear, multimodal, and complex in many ways determined by the 
interaction of the traits originally measured. Mischel is aiming at a man of straw; he does 
not criticize modern personality theory as it really is. 
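
The additive case described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real data set: the ten traits are simulated as independent standard normal scores, the criterion is assumed to be their simple unweighted sum, and the sample size and random seed are arbitrary.

```python
import math
import random

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)       # arbitrary seed, for reproducibility only
N_PEOPLE = 20000     # assumed sample size
N_TRAITS = 10

# Ten independent, equally weighted traits (simulated, hypothetical).
traits = [[random.gauss(0.0, 1.0) for _ in range(N_PEOPLE)]
          for _ in range(N_TRAITS)]

# Assumed criterion: behaviour is the simple sum of the ten traits.
criterion = [sum(trait[i] for trait in traits) for i in range(N_PEOPLE)]

# Share of criterion variance associated with each trait taken alone.
variance_shares = [pearson_r(trait, criterion) ** 2 for trait in traits]
total_share = sum(variance_shares)

for k, share in enumerate(variance_shares, start=1):
    print(f"trait {k:2d}: r^2 = {share:.3f}")
print(f"sum of r^2 over all traits: {total_share:.3f}")
```

Each trait alone accounts for roughly 10 per cent of the variance, yet the ten together account for essentially all of it. Summing squared correlations in this way is legitimate only because the simulated traits are independent; with correlated traits a multiple regression would be needed.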

(10) Mention has already been made of the need to specify the population tested in 
order to test properly a particular hypothesis. Thus the Hartshorne & May studies of 
honesty and deceit, in using young children in whom learning and conditioning would not 
yet have succeeded in producing very consistent reactions to complex situations, loaded the 
dice in favour of ‘inconsistency’; that they still observed a considerable amount of 
consistency is evidence for the strong determination of conduct by personality. As H. J. 
Eysenck (1970) has shown, later workers, using similar tests with older subjects, obtained 
much stronger evidence for consistency. Again, averaging over many different populations 
as Mischel does is essentially a meaningless procedure, and the result cannot be interpreted 
in the way Mischel attempts to. 

(11) It is well known in psychometrics that correlations cannot be interpreted directly 
without some knowledge of the internal reliability of the scores correlated. Any attempt to 
estimate the relationship between two variables which relies on unreliable estimates of one 
or both may grossly underestimate the ‘true’ correlation, and attempts should always be 
made to correct for attenuation. This is practically never done in the studies quoted and 
averaged by Mischel, although the reliabilities of the variables in question are often known, 
and frequently fall short of what might be regarded as adequate. For this reason, among 
others, the average correlation of 0.30, taken as an estimate of the ‘true’ relationships 
in question, must be regarded as underestimating these to an unknown but probably 
substantial extent. Cattell et al. (1970) give tables of predictive correlations for many 
criteria; the small size of the reported reliabilities of the tests used suggests that the ‘true’ 
relationships would be several times as strong as those reported. 
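
The correction for attenuation referred to here is the classical formula in which the observed correlation is divided by the square root of the product of the two reliabilities. A minimal sketch follows; the reliabilities of 0.50 and 0.60 are hypothetical values chosen for illustration and are not taken from Mischel, Cattell or any other source cited.

```python
import math

def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Classical disattenuation: r_true = r_xy / sqrt(r_xx * r_yy)."""
    if not (0.0 < reliability_x <= 1.0 and 0.0 < reliability_y <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)

r_obs = 0.30            # the average figure discussed in the text
test_reliability = 0.50       # hypothetical reliability of the personality test
criterion_reliability = 0.60  # hypothetical reliability of the criterion

corrected = correct_for_attenuation(r_obs, test_reliability, criterion_reliability)
print(f"observed r = {r_obs:.2f}, corrected r = {corrected:.2f}")
```

With these assumed reliabilities an observed correlation of 0.30 corresponds to a corrected value of about 0.55; with perfectly reliable measures the correction leaves the observed value unchanged.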

(12) This argument becomes even stronger when we consider that much of the work 
criticized by Mischel attempts to predict real-life behaviour. Such predictions can come to 
grief for several reasons, among which is the obvious one that there is in fact no 
consistency of conduct such as is implied in the theory that reasonably high correlations 
will be found; this is in practice the only one considered by Mischel. It is equally possible, 
as has already been mentioned, that the wrong theory has been tested, or that the wrong 
tests have been used, or that the tests used were unreliable. What is equally possible, and 
often demonstrably true, is that the criterion may be excessively faulty, i.e. either invalid or 
unreliable. Educational criteria are famous for their lack of reliability (Hartog & Rhodes, 
1936); other real-life criteria in industry, psychiatry and elsewhere often share this fault. 
Unless we have reason to believe that our criteria are both reliable and valid, the failure of 
tests to predict these criteria adequately cannot be used as proof of the inconsistency of 
conduct. 


States and traits as intervening variables 


A question that is fundamental to any assessment of the state-trait approach to personality 
is that of deciding upon adequate criteria for evaluating the adequacy of state and trait 
constructs. One possible criterion was favoured by Mischel (1969). He regarded 
cross-situational correlations as being of prime relevance, and concluded as follows: ‘[There 
is] impressive evidence that on virtually all of our dispositional measures of personality 
substantial changes occur in the characteristics of the individual over time, and, even more 
dramatically, across seemingly similar settings cross-sectionally' (p. 1012). 

Mischel here disregards not only, as already noted, the powerful genetic evidence for 
hereditary determination of conduct; he fails to pay attention to the many studies 
demonstrating impressive consistency of conduct over time, often starting in early 
babyhood (e.g. Burt, 1965; Thomas & Chess, 1977). There are indeed changes; what is 
surprising is that there is also so much consistency. It is easy to be blinded by either the 
degree of consistency or of inconsistency, and to conclude that personality is more or less 
important in describing and determining conduct than it really is; such temptations must 
be resisted. In certain areas there is more longitudinal consistency than in others; we must 
seek quantitative evidence, but we must resist the temptation to calculate meaningless 
averages over all conditions, thus lumping the consistent areas with the inconsistent. 

Since both state and trait concepts clearly represent intervening variables, it is reasonable 
to consider whether there are other ways of justifying the postulation of intervening 
variables, over and above the use of cross-situational correlations. A classic analysis of this 
issue was presented by Miller (1959). He pointed out that if an experimenter only has 
information about the effect of a single independent variable on a single behavioural 
measure (e.g. situational stress on difficult-task performance), then it is simplest to 
represent the relationship as a direct one of stress on performance (i.e. stress affects 
performance), since this account involves only a single functional relationship. Under these 
circumstances, the introduction of an intervening variable (e.g. anxiety) would merely 
complicate matters by requiring one to refer to two functional relationships (i.e. stress 
causes anxiety; anxiety affects performance). 

In terms of the number of functional relationships required, the break-even point occurs 
when there are two independent variables and two dependent variables, since four 
functional relationships are required whether an account incorporating or not 
incorporating intervening variables is preferred. If an experimenter has information about 
three independent variables and three dependent variables, however, then two potential 
advantages can accrue from the use of a unifying intervening variable. Firstly, there is a 
real improvement in efficiency from nine functional relationships if no intervening variable 
is postulated down to six functional relationships if one is postulated (see Fig. 1). Secondly, 
the intervening-variable approach allows for experimental testing and possible disproof of 
the notion that a single intervening variable can account for the data. For example, if a 
given level of blame produces more distractibility than particular ego-involving 
instructions, then it should also produce greater anxiety on a mood questionnaire and 
stronger physiological responses. 

Figure 1. Two ways of describing interrelationships among independent and dependent variables: (a) 
without postulating an intervening variable; (b) with postulation of an intervening variable. 
[Independent variables: ego-involving instructions, blame, punishment. Dependent variables: 
physiological responding, mood questionnaire score, distractibility. Intervening variable in (b): anxiety.] 
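
The count of functional relationships in Miller's argument can be tabulated directly. In the sketch below the function names are illustrative only; the counting rule itself (every independent variable linked to every dependent variable in the direct account, as against one link per variable to a single intervening variable) follows the description in the text.

```python
def direct_relationships(n_independent, n_dependent):
    """Each independent variable is linked separately to each dependent variable."""
    return n_independent * n_dependent

def mediated_relationships(n_independent, n_dependent):
    """Each variable is linked once to a single intervening variable."""
    return n_independent + n_dependent

for n in (1, 2, 3):
    direct = direct_relationships(n, n)
    mediated = mediated_relationships(n, n)
    print(f"{n} independent x {n} dependent: "
          f"{direct} direct vs {mediated} via an intervening variable")
```

The three cases reproduce the figures given above: one relationship against two, four against four at the break-even point, and nine against six.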

This basic method of justifying state and trait constructs has been employed successfully 
with respect to several major personality dimensions, including extraversion and 
neuroticism (H. J. Eysenck, 1967) and state and trait anxiety (Spielberger, 1972). Use of 
this method would seem to provide a satisfactory way of refuting the oft-expressed criticism 
that trait concepts are inherently circular, an allegation forcibly put by Wiggins (1973): 
‘Perhaps the most objectionable feature of the trait construct ... is the manner in which 
traits are construed as hypothetical entities which cause behaviour. The objection is ... to 
hypothetical constructs that are animistic and circular, and that direct attention from 
lawful empirical relationships’ (p. 366). As already indicated, once a trait construct is used 
to explain the diverse effects of several independent variables, as in Fig. 1, then that trait 
construct is clearly no longer circular or tautological. That this is the case can be seen from 
the fact that empirically testable predictions can be made. 


Persons, situations, and their interaction 


The fact that individual differences in personality often account for relatively modest 
percentages of behavioural variance has led several researchers to investigate whether 
situational factors might account for substantially larger percentages of the variance. In 
addition, there is an increasing awareness of the importance of interactions between the 
individual and the situation (e.g. Magnusson & Endler, 1977). Mischel (1973a) has argued 
that these interactions reflect idiosyncratic and theoretically unpredictable interrelationships. 
The general expectations from the state-trait position are that situational factors will not 
prove to be substantially more consequential than person factors, and that many of the 
observed interactions between persons and situations are both replicable and theoretically 
predictable. 

Several studies have been done in an attempt to evaluate the relative importance of the 
person and of the situation in determining observed behaviour. Bowers (1973) discussed 11 
such studies that had calculated the percentages of the total variance accounted for by 
various factors. The mean percentage of the variance accounted for by persons was 11.27 
per cent, compared with 10.17 per cent for situations, and 20.77 per cent for the interaction 
between persons and situations. Thus, the data suggested that situations and persons were 
of comparably modest consequence as behavioural determinants, whereas the interaction 
between the two factors was approximately twice as influential as either of the main effects. 

A similar approach was adopted by Sarason et al. (1975), with the exception that they 
considered only studies reporting data on both personality and situational factors. The 
difference is that an interaction between persons and situations comprises a composite of 
all possible interactions between personality characteristics and situations for the particular 
situations of interest. Sarason et al. (1975) assessed the proportions of behavioural variance 
in each of 138 analyses of variance by means of the omega-squared statistic. On average, 
the situation accounted for 10.3 per cent of the variance, personality for 8.7 per cent, and 
the interaction between the situation and personality for 4.6 per cent. Thirty-five per cent 
of the situational main effects accounted for more than 10 per cent of the variance, whereas 
29 per cent of the personality main effects and 11 per cent of the personality by situation 
interactions did so. 

Before discussing the theoretical relevance of these data, it is necessary to point out their 
limitations. As Golding (1975) has noted, while the omega-squared ratios used in these 
studies do technically index the percentage of total variation, they are inappropriate as 
measures of the general significance of the various factors. It would be possible, for 
example, for the reliability of a trait to be perfect across two situations, and yet still have 
individual differences account for only a small proportion of the variance. If the trait were 
running ability and the two situations were the 100 metres and the marathon, then clearly 
the dependent variable (running time in seconds) would be primarily affected by situational 
factors even if the rank order of performance were identical across situations. In other 
words, by suitable selection of situations and of personality dimensions, any pattern of 
results could be obtained. 
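
The running example can be worked through numerically. The times below are invented for illustration, and the variance shares are raw proportions of the total sum of squares in a persons by situations layout rather than the omega-squared estimates used in the studies reviewed; the sketch is intended only to show the direction of the effect.

```python
# Hypothetical running times in seconds for five runners (assumed values).
times = {
    "100 metres": [10.5, 11.0, 11.5, 12.0, 12.5],
    "marathon":   [9000.0, 9600.0, 10200.0, 10800.0, 11400.0],
}

def rank_order(values):
    """Indices of the runners sorted from fastest to slowest."""
    return sorted(range(len(values)), key=lambda i: values[i])

situations = list(times)
n_situations = len(situations)
n_persons = len(times[situations[0]])

all_scores = [t for s in situations for t in times[s]]
grand_mean = sum(all_scores) / len(all_scores)

# Partition the total sum of squares into situation, person and residual
# (person by situation) components for a layout without replication.
ss_total = sum((t - grand_mean) ** 2 for t in all_scores)
ss_situation = n_persons * sum(
    (sum(times[s]) / n_persons - grand_mean) ** 2 for s in situations)
person_means = [sum(times[s][p] for s in situations) / n_situations
                for p in range(n_persons)]
ss_person = n_situations * sum((m - grand_mean) ** 2 for m in person_means)
ss_interaction = ss_total - ss_situation - ss_person

share_situation = ss_situation / ss_total
share_person = ss_person / ss_total
share_interaction = ss_interaction / ss_total
same_rank_order = rank_order(times["100 metres"]) == rank_order(times["marathon"])

print(f"identical rank order across situations: {same_rank_order}")
print(f"situation share of variance:   {share_situation:.3f}")
print(f"person share of variance:      {share_person:.3f}")
print(f"interaction share of variance: {share_interaction:.3f}")
```

With these assumed times the rank order of the runners is identical in the two events, yet almost all of the variance in running time is attributed to the situation and very little to persons.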

In spite of these shortcomings, the data clearly suggest the inadequacy of any approach 
that fails to utilize both individual-difference and situational factors in the theoretical 
explanation of observed behaviour. A more challenging issue, however, concerns the proper 
interpretation of the interaction terms (i.e. persons by situations, and personality by 
situations). One extreme position on this issue was taken by Mischel (1973a): ‘When 
interpreting the meaning of the data on Person × Situation interactions and moderator 
variables, it has been tempting to treat the obtained interactions as if they had 
demonstrated that people behave consistently in predictable ways across a wide variety of 
situations... The available data on this topic now merely highlight the idiosyncratic 
organization of behaviour within individuals, and hence the uniqueness of stimulus 
equivalences for each person’ (p. 258). We would argue, on the contrary, that replicated, 
theoretically predictable, interactions between individual-difference variables and 
situational factors have frequently been found, indicating that behaviour is by no means 
idiosyncratically organized within each individual. 

One example of a consistently obtained interaction between a personality variable and a 
situation factor is discussed by M. W. Eysenck (1976). The Yerkes-Dodson Law stated 
that task performance is interactively determined by arousal and task difficulty: the more 
difficult the task, the lower the optimal level of arousal. If one assumes that introverts are 
chronically more aroused than extraverts (H. J. Eysenck, 1967), then it is reasonable to 
predict that introverts will outperform extraverts on simple learning tasks, but that the 
reverse will occur on difficult tasks. This interaction has been obtained at least eight times 
(Siegman, 1957; Jensen, 1964; Shanmugam & Santhanam, 1964; McLaughlin & Eysenck, 
1967; Howarth, 1969a, b; Bone, 1971; Allsopp & Eysenck, 1974). 
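
The logic of this predicted crossover can be made concrete with a toy model. Everything in the sketch below is an assumption introduced for illustration: the quadratic inverted-U function, the arousal levels assigned to introverts and extraverts, and the optimal arousal levels assigned to easy and difficult tasks. It shows only that the two premises stated above suffice to generate the interaction, not that performance follows this particular function.

```python
def performance(arousal, optimal_arousal):
    """Assumed inverted-U: performance declines with squared distance from the optimum."""
    return 1.0 - (arousal - optimal_arousal) ** 2

# Assumed chronic arousal levels (arbitrary 0-1 scale): introverts higher.
AROUSAL = {"introvert": 0.6, "extravert": 0.4}

# Assumed optimal arousal: lower for the more difficult task (Yerkes-Dodson).
OPTIMUM = {"easy task": 0.7, "difficult task": 0.3}

scores = {
    (group, task): performance(arousal, optimum)
    for group, arousal in AROUSAL.items()
    for task, optimum in OPTIMUM.items()
}

for (group, task), score in sorted(scores.items()):
    print(f"{group:9s} on {task:14s}: {score:.2f}")
```

Under these assumptions introverts score higher than extraverts on the easy task and lower on the difficult task, which is the form of interaction reported in the studies listed.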

Further examples of consistently replicated and theoretically predicted interactions 
between personality and situational factors are reviewed by Costello (1964). He cited six 
studies that had obtained a significant interaction between instructional stress (neutral 
versus ego-involving) and trait anxiety, with the more motivating instructions being 
detrimental to performance for high anxiety subjects and facilitatory for low and medium 
anxiety subjects. A similar interaction between a different stress factor (absence versus 
presence of blame) and trait anxiety was obtained in seven different studies discussed by 
Costello (1964). 


Personality as an indispensable part of experimental and applied psychology 


Our discussion thus far has been directed towards demonstrating that Mischel's criticisms 
of personality theory and application are either non-factual or even anti-factual. In this last 
section we shall argue that the inclusion of personality variables in empirical studies of 
psychological problems is not only permissible, but mandatory. The evidence is by now 
very strong that personality enters into predictions made in experimental, social, abnormal, 
industrial, educational and pharmacological psychology to such an extent, and in so 
predictable a manner, that it is possible to demonstrate the effects of neglecting it as a 
variable interacting with the ‘main effects’ of experimental manipulation (e.g. H. J. 
Eysenck, 1967, 1976b, 1978; Broadhurst, 1978). The effect is to reduce the ‘main effects’ 
portion of the total variance to a relatively small proportion, and to inflate the error 
variance to an unacceptable degree. When personality is included explicitly, and in line 
with theoretical prediction, in the design, then much of the so-called error variance is 
recognized as being truly main effects × personality interaction variance, and hence useful in 
writing proper prediction equations. Many studies are cited in the references given above 
where the main effects part of the variance is completely absorbed in the error variance 
when personality is excluded from the experimental design, but where all the main effects 
appear in the interaction with personality. Such startling evidence cannot be omitted from 
any discussion of the consistency of personality; both consistent (trait) and inconsistent 
(state) measures of personality are implicated (M. W. Eysenck, 1977). 

It is possible to go further than this and to suggest that many of the theoretical battles 
which have been fought in psychology owe their origin and their very intransigence to the 
fact that the organisms studied by the proponents of the divergent theories were genetically 
different. Thus Jones & Fennell (1965) and H. J. Eysenck (1967) have suggested that the 
great debate between the followers of Tolman on the one hand, and of Hull and Spence on 
the other, regarding the major laws of learning, may have been sparked off and sustained 
by the choice of rats of the 'emotional' strain of C. S. Hall by Tolman, and of rats of the 
'non-emotional' strain by Hull and Spence! Emotional rats behave in a fashion that gives 
support to S-S theorists like Tolman, while non-emotional rats behave in a fashion that 
gives support to S-R theorists like Hull and Spence. Some evidence for this suggestion is 
provided in an experiment reported by Jones & Fennell (1965), and the suggestion that 
theoretical quarrels may be due to differential selection of (animal or human) subjects finds 
support in many other studies (e.g. H. J. Eysenck, 1976b). 

It may be further suggested that many theoretical suggestions fail to be confirmed, or 
that experimental findings arising from such theoretical suggestions fail to be replicated, 
because personality factors are not taken into account. Thus for instance Shigehisa et al. 
(1973) have shown that the failure of many investigators to obtain clear-cut evidence for or 
against the hypothesis that sensory thresholds can be lowered by increasing sensory input 
in modalities other than that being tested, was due entirely to their failure to separate their 
subjects into extraverts and introverts; extraverts do show this tendency, whereas introverts 
show the opposite tendency, thus demonstrating, as predicted, Pavlov's transmarginal 
inhibition. No replicability can be expected from studies that do not include consistent 
behaviour patterns (in this case, introversion-extraversion) in their design. Even more 
impressive than the mere empirical finding is the fact that the finding was predicted on the 
basis of personality theory; it seems difficult to argue that consistent behaviour patterns do 
not exist when a theory postulating such patterns can be used to generate predictions so 
amply confirmed. 

What has been said here harks back of course to Cronbach's (1957) Presidential Address 
to the American Psychological Association concerning ‘The two disciplines of scientific 
psychology’. This address received many plaudits, but was in effect disregarded by most 
experimental psychologists on the one side, and by most personality psychologists (to coin 
a term which embraces all those who use correlational methods in an attempt to study 
individual differences in traits, abilities, or attitudes) on the other. Mischel's critique of personality 
theory has encouraged this unfortunate break between the two disciplines of scientific 
psychology; it is our hope that this reply to his criticisms may go some way towards 
healing it. Experimental psychology (and social, educational, abnormal, industrial and any 
other kind of psychology) is unreasonably handicapped by disregarding personality factors 
in the widest sense; personality theorists are unreasonably handicapped by not paying 
attention to the major findings of experimental psychology, and gearing their conceptions 
to the concepts most widely used there. 


Summary and conclusion 


Of the various criticisms made of state-trait approaches to personality, the central one is 
that the actual inconsistency of behaviour contrasts with the prediction from the state-trait 
approach of behavioural consistency. However, since most state-trait theories argue that 
there is consistency at the intervening-variable level rather than at the behavioural level, 
this criticism is not damaging. 

If state-trait theories cannot be satisfactorily evaluated by measures of cross-situational 
consistency, then the issue becomes one of proposing suitable criteria. Among those that 
merit consideration are the following: 

(1) Trait constructs should have a demonstrated hereditary component. 

(2) Trait and state constructs should account for the interrelationships among several 
independent and dependent variables. 

(3) Trait and state constructs should have a wide range of applicability. 

(4) Interactions between personality and situational factors should be theoretically 
predicted and replicable. 

In essence, many state-trait theories emerge with credit when evaluated against the set of 
criteria discussed in this article. The problem is that Mischel and others have evaluated 
state-trait approaches from a limited and sometimes irrelevant perspective. 
‘Behavioural interaction’ and ‘interactional psychology’ theories of 
personality: Similarities, differences, and the need for unification 


Arthur W. Staats 





The social behaviourism theory of personality and behavioural interaction is presented and compared 
to interactional psychology’s conception, indicating the features shared. Interactional psychology’s 
conception is seen to be inconsistent with its person-situation data. Social behaviourism’s theory is 
seen to provide better analysis and stipulation of personality, situations and interaction. The social 
behaviourism theoretical developments should be employed in the new approaches attempting to 
combine behaviour principles with personality concepts of a cognitive-emotional type. Social 
behaviourism’s multi-level theory and its extension into various areas of psychology provide many 
avenues for development. Finally, the separation of interactional psychology’s development from the 
related principles of social behaviourism is seen as a drawback produced by the present separatism of 
United States psychology’s preparadigmatic state of development. 





Watsonian behaviourism was revolutionary in its rejection of traditional types of 
knowledge of man. Its rejection of ‘mentalism’, still followed in contemporary operant 
behaviourism, involved discarding concepts such as personality, attitudes, emotions, 
cognitions and so on, at least as causal processes that determine human behaviour. Skinner 
(1975, p. 43) has followed this radical behaviourism, even very recently stating that 
emotions have nothing to do with the causation of behaviour. He grants that we have 
emotions that are the effects of environmental conditions, but denies that emotions can help 
determine our behaviour. In general, Watsonian behaviourism considers personality as 
behaviour (see Keller & Schoenfeld, 1950, p. 366); which makes personality an effect, not a 
cause. 

Traditional conceptions, in contrast, treat personality as a cause of the individual’s 
behaviour. A schism in views is involved that has divided psychology for many years. In 
contrast, a third-generation behaviourism has been in formulation over a number of years 
that has attempted to provide a unifying theory that would bridge the schism. This 
unification has occurred along many dimensions, in many areas. While this new 
behaviourism, called social behaviourism, has contributed to the development of behaviour 
modification and social learning theory (see Staats, 1957a, b, 1963, 1973, in press), it has 
also introduced concepts of personality theory and cognitive theory. It is suggested that 
these new concepts of personality and interaction (the behavioural interaction approach) 
provide the basis for a rapprochement with traditional approaches to psychology — and 
that this rapprochement will come to characterize the new generation of behaviourism. 

To continue, however, there are contemporary theory developments in psychology that 
have begun to follow this development of social behaviourism, at least in certain 
characteristics, one being the attempt to effect a rapprochement between traditional 
concepts of personality and cognitive psychology with behavioural principles. (Other 
approaches — see Dollard & Miller, 1950; Eysenck, 1967 — that may be considered 
learning-personality rapprochements will not be considered here.) These later developments 
have not related themselves to the earlier social behaviourism and the result has been a 
detriment to theory development in various ways. One such important new development is 
called interactional psychology. For several reasons interactional psychology should be 
related to the behavioural interaction theory of personality of social behaviourism. The 
purposes of the present article are thus to (1) review social behaviourism’s behavioural 
interaction theory of personality and relate it to the principles of interactional psychology, 
(2) indicate the similarities of the two approaches, (3) outline the differences in the theories, 
(4) indicate some of the lacunae and other weaknesses in interactional theory — for example, 
that it employs two unrelated and inconsistent concepts of interaction, (5) indicate how it 
would be productive to employ social behaviourism’s behavioural interaction theory to 
further develop interactional psychology, and (6) indicate that the separation of 
interactional psychology from social behaviourism has been detrimental to the theory 
development in various ways, and this involves a general feature of psychology. 


Social behaviourism’s personality theory: The behavioural interaction conception 


Social behaviourism is a true behaviourism in its objective specification of its concepts and 
principles. Its basic principles are those of classical and instrumental conditioning — 
interrelated in a new theoretical structure called three-function learning theory (Staats, 
1968b, 1970, 1975). Its background includes the works of the first generation of 
behaviourists (Pavlov, Thorndike and Watson) and second-generation behaviourists (such 
as Tolman, Hull and Skinner). Social behaviourism is a third-generation behaviourism, 
however, in various ways, one of which involves the concept of personality and the 
principles of interaction. As has been indicated, one of the purposes of social behaviourism 
has been to provide a theory within which the schism between radical behaviourism and 
traditional theory can be resolved. 

The approach that is now called social behaviourism began in its first general statement 
to deal with this fundamental schism in psychology. Beginning with the basic principles of 
learning, theoretical analyses of such aspects of human behaviour as language (Staats, 
1963, ch. 4) and motivation (Staats, 1963, ch. 7) were formulated. Other behaviourists, for 
example Skinner (1957), have analysed aspects of language in terms of learning. Social 
behaviourism was distinctly different, however. It treated the functions that language had 
for the individual in his behaviour, not just the way that language was learned. The social 
behaviourism theory of language outlined the repertoires that constituted language. Then 
the theory showed how the individual's reasoning, his planning, his self-concept, and his 
intelligence were composed of such repertoires (and others). The manner in which these 
language repertoires functioned to affect the individual's behaviour was indicated, as well 
as the way that others responded to the individual's language. The individual's basic 
behavioural language repertoire was seen to constitute in large part the individual's 
cognitive personality processes. 

The same was true in treating the topic of emotions and human motivation. Other 
behaviourists have been concerned with the basic principles of animal motivation. For 
example, Hull considered deprivation to produce drive which then energized the 
individual's habits. Social behaviourism also provided its basic theory of motivation. But 
social behaviourism did not stop there. The account continued on to develop a theory of 
emotions and motivation as a personality construct. It was said that the individual learned 
a system of stimuli that elicited an emotional response in him — that the system differed for 
individuals and groups. Moreover, the theory indicated how this system — the person's 
‘reinforcement’ system — had personality functions for him. His reinforcement system (later 
called his A-R-D system, as will be seen) determined how he behaved, in various ways, 
including ways that were important for understanding abnormal behaviour. Again, the 
concept of personality was that of the basic behavioural repertoire. In this case the basic 
behavioural (personality) repertoire consists of the various stimuli that elicit emotional 
responses (positive or negative) in the individual. 

To continue, the social behaviourism theory, even in its 1963 level of development, 
indicated the interaction that was involved between the individual's personality and the 
environment, stated in explicit behavioural terms.* As one example, the individual's 
self-concept was said to be influenced by the environment - the manner in which the 
self-concept was learned was specified. In addition, however, the self-concept was seen in 
turn to have effects on the social environment, and these environmental effects would in 
turn act back upon the further formation of the individual's self-concept. 


First, it is suggested that the types of statements a person makes about himself, the things he can 
do, how well he can do them, and...to which we refer as self...may influence the way other 
people respond to the person. A person who makes deprecatory statements about himself is 
likely to be responded to in a manner appropriate to those statements...In addition, the 
statements the individual makes about himself would be expected to function as a stimulus that 
would control his own behavior...Thus a distorted set of self-statements may mean that the 
individual behaves in accord with the statements...[T]he individual who makes “confident” 
self-statements may, other things being equal, attempt the task on the basis of his reasoning and 
be successful. The self-confident individual may also, because of his verbal behavior, tend to gain 
more attention, social approval, and social rewards such as raises and honors. These experiences 
could reinforce the verbal behavior and act as [incentives] for further “self-confident” 
self-description...[T]he individual's “self-concept” may be a determinant of the individual's 
behavior and, further, of the stimulus conditions the individual will experience (Staats, 1963, 
pp. 265-266). 


The social behaviourism analyses included numerous and detailed examples of the 
principles of interaction, for example, the manner in which abnormal behaviour can 
develop, in that the individual’s behaviour elicits responses from others that tend to 
increase the abnormal behaviour (Staats, 1963, pp. 386-389). An earlier account had 
outlined such an interaction in the acquisition of a schizophrenic’s abnormal language 
(Staats, 1957b). 

A central principle in the interaction conception introduced into the social behaviourism 
account — a principle that provided the basis for the extension of the interaction conception 
and for the introduction of the full concept of personality into a behaviouristic approach — 
was that behaviour was an independent variable (a cause) as well as a dependent variable 
(an effect). In a section in the treatment of personality entitled ‘The self as an independent 
variable’, it was said that ‘In these terms one’s self statements, or the self, could be 
considered to be an independent variable that would control one’s own behavior and the 
behavior of others’ (Staats, 1963, p. 265). As will be indicated, this theoretical approach 
has been elaborated in a number of ways that are relevant herein. First, however, we will 
turn to the consideration of the development of the theory which is presently referred to as 
interactional psychology. 


Interactional psychology 


The characterization of the modern interactional psychology approach has considered 
social learning theory to have contributed essential concepts to the approach (Ekehammar, 
1974; Endler & Magnusson, 1976). It should be noted that contemporary social learning 
theory did not originally formulate either the principles of interaction or a concept of 
personality as a causal mechanism. The thrust of social learning theory (Bandura & 
Walters, 1963; Mischel, 1968, 1971), in these respects, was in the tradition of radical 
behaviourism. The social learning development of this position (see Mischel, 1968) was 
thus recognized to exclude personality, as indicated in later works calling the position 
situationism (Alker, 1972; Bowers, 1973). Ekehammar (1974, p. 1035), for example, states 
that Mischel’s position was essentially ‘a situationist [B = f(S)] conceptualization of 
personality, which means that it is “essential to study the differences in the behaviors of a 
given person as a function of the conditions in which they occur” [Mischel, 1968, p. 296]’. 
Prior to 1968 the social learning theory approach of Mischel and Bandura, unlike the social 
behaviourism theory already described, included no concepts of behavioural interaction, or 
of personality as a causative process, of the types that were later considered to be 
foundations of the new interactional psychology. 


* It should be noted that Rotter (1954) introduced the concepts of personality and also interaction into a 
learning theory orientation. His approach should be considered an antecedent of the present conception, although 
the present theory is different in major characteristics. 

In a 1968 social learning theory treatment of abnormal psychology, however, that was 
centrally based upon prior social behaviourism concepts and taxonomy (see Staats, 1963, 
ch. 11), Bandura indicated concurrence with the social behaviourism concept of interaction 
as the following statement indicates. 


In view of the demonstrated importance of stimulus control over behavior, a social learning 
taxonomy (classification) of psychopathology must attend to the interaction between behavioral 
predispositions (subject variables), on the one hand, and stimulus events, on the other. This type 
of analysis (Staats and Staats, 1963) can help both to explain the acquisition and maintenance of 
deviant response patterns and to guide therapeutic practices (Bandura, 1968, p. 298). 


Later the principles of behavioural interaction — where the individual's behaviour affects 
the social environment, which in turn affects the individual's later behaviour — were given 
the label of reciprocal influence and reciprocal determination in Bandura's approach 
(Bandura, 1969). Although Bandura has emphasized reciprocal determinism in his later 
accounts, he has not yet presented the principles in detail or indicated their conceptual or 
experimental foundations (see also Bandura, 1978), as will be indicated. Bandura (1978) 
has also introduced a concept of personality into his theory in very recent times. The 
relationship of this concept to the earlier concept of social behaviourism will be indicated 
further on, as will the concept of personality that in recent years has been added to the 
social learning theory of Mischel. That is, Mischel has also reversed his previous radical 
behaviouristic position of situationism in a 1973 article entitled ‘Toward a cognitive social 
learning reconceptualization of personality’. In addition to adding a personality concept he 
included the principle of interaction in this later account, in a manner very like the 
preceding behavioural interaction conception of social behaviourism (see especially Staats, 
1971a). Both Mischel (1973) and Bandura (1978) have said that social learning theory 
never intended to reject the concept of personality or to exclude the concept of interaction, 
but until the recent changes social learning theory had these characteristics of radical 
behaviourism (Alker, 1972; Bowers, 1973). 

The new field of interactional psychology has absorbed aspects of the behavioural 
interaction principles, referring them to social learning theory, and it has other 
characteristics like those of social behaviourism as well, as will be summarized. Endler & 
Magnusson list four main features of the modern interactional psychology conception, and 
these may be summarized briefly to indicate the overlap between these principles and the 
same principles in social behaviourism. 

The first principle is ‘Actual behavior is a function of a continuous process or 
multidirectional interaction (feedback) between the individual and the situation that he or 
she encounters’ (Endler & Magnusson, 1976, p. 968). The manner in which the behavioural 
interaction between the individual's personality repertoires and the environment occurs has 
already been exemplified, beginning with analyses made in 1963. The second feature of the 
new interactional psychology is that ‘The individual is an intentional active agent in this 
interaction process’ (Endler & Magnusson, 1976, p. 968). The behavioural interaction 
conception of social behaviourism has included 'self-direction' as a central element by 
which it is differentiated from traditional behaviourism approaches. ‘Any conception that 
suggests that we do not have a hand in determining what we do fails at the very 
beginning...Humans are not just automatons and we resist a conception that suggests this, 
as traditional learning theories have’ (Staats, 1971a, p. 251). 
The third feature of interactional psychology is stated as follows. ‘On the person side of 
the interaction, cognitive factors are the essential determinants of behavior, although 
emotional factors do play a role' (Endler & Magnusson, 1976, p. 968). As indicated in the 
above description of social behaviourism, behavioural interaction theory has stressed the 
language-cognitive repertoires and the emotional-motivational repertoire and, as will be 
indicated further on, social behaviourism has provided detailed and specific treatment of 
these personality areas — with supporting research. A third personality repertoire has also 
been treated, that is, the sensory-motor personality repertoire (Staats, 1963, 1968a, 1971a, 
1975). 

Finally, Endler & Magnusson list the fourth feature of interactional psychology as 
follows. ‘On the situation side, the psychological meaning of the situation for the individual 
is the important determining factor’ (1976, p. 968). Social behaviourism has elaborated 
various mechanisms to account for individual differences in internal response to situations 
and has indicated how these personality differences determine differences in behaviour (see 
Staats, 1968b, 1975). As an example, Staats et al. (1973) conducted a series of three 
experiments to show how differences in individuals' emotional response to situations would 
determine their differences in behaviour. One of these experiments indicated the 
mechanisms and principles by which the psychological meaning of the alternative stimuli 
for the individual will determine his choice behaviour and thus the stimuli that he 
experiences. It may be added that Endler & Magnusson discuss consistency in behaviour 
across situations (something that Mischel, 1968, had rejected) in terms of the individual's 
choice of the situations he encounters (1976, p. 961). Social behaviourism has always been 
concerned with personality consistency and the mechanisms involved. Rotter (1954) has 
also discussed personality consistency in terms of choice of new experiences. As will be 
indicated, interactional psychology does not explicate or support the concept of consistency 
through choice with research. 


Differences in approach 


The preceding sections have indicated that there is overlap in principles between 
behavioural interaction theory and interactional psychology theory. In considering the two 
approaches, however, it is important to realize that there are very basic differences in the 
two theories that can be indicated by comparison. 

To begin, interactional psychology actually reflects two types of traditions. The work of 
individuals such as Endler & Magnusson has previously been largely in the traditional field 
of personality, not in behaviourism or in a behavioural approach. One of the traditions of 
interactional psychology, thus, has been in the use of traditional personality tests in the 
study of situation-personality interaction. The other tradition in the new interactional 
psychology involves the inclusion of the principles of behavioural interaction that now 
constitute the core of the conceptual aspects of the approach. The fact is that two quite 
different types of tradition have been intermixed here, but not unified. As presently stated, 
the two types of interaction are actually inconsistent at basic points, as will be indicated. 


The statistical artifact interaction of interactional psychology 


The data base of interactional psychology (Endler & Magnusson, 1976) resides in studies 
that suggest that in addition to personality causal variables and situational causal 
variables, there is interaction causality. This interaction concept is not acceptable to social 
behaviourism; from this viewpoint the data on interaction in interactional psychology 
actually may be seen as a design or statistical interaction, not as a third source of causality. 
This can be exemplified by reference to the type of data that characterizes the interactional 
psychology approach. That is, Endler & Magnusson cite studies by Baron & Ganz (1972) 
and Baron et al. (1973) to characterize the concept of interactionism. Using a personality x 
treatment interactional design, ‘they found that internals perform better with intrinsic 
feedback’ than do externals (Endler & Magnusson, 1976, p. 965). As another example, 
Gilmor & Minton (1974) found that under a success situation internal-locus-of-control 
subjects attributed success to internal factors more than did externals, but internal 
attribution was reversed under the failure situation. Other experiments have found 
interactions between anxiety trait measures and situational variations and between aptitude 
measures and type of teaching situation (Endler & Magnusson, 1976, pp. 965-966), and so 
on. They conclude, ‘These studies suggest that interactions rather than main effects may 
represent the true state of affairs, and seriously question the trait hypothesis that the rank 
order of individuals is stable across situations’ (Endler & Magnusson, 1976, p. 966). 

However, the concept of interaction that is derived from these data, from the standpoint 
of the social behaviourism analysis, appears to have no relationship to the behavioural 
principles of interaction also included in the approach. The fact that internals attribute 
more success to themselves than do externals in success situations, and vice versa, for 
example, in no way serves as a basis for deriving or validating the behavioural principles 
that are at the heart of interactional psychology. What, one may ask, do these data have to 
do with the principle that the individual’s behaviour affects the environment, that the 
environment further affects his behaviour, and so on? None of the other four social 
behavioural principles described by Endler & Magnusson (1976, p. 968), as already 
excerpted in the previous section, is derived from or validated by the data showing a 
statistical interaction between personality measures and situational conditions. 

Support of the behavioural interaction principles has to come from a source other than 
the experiments and data of interactional psychology or social learning theory. However, 
Endler & Magnusson do not refer to a data base to support the principles of behavioural 
interaction that are central to the position they describe. Nor does their presentation 
include analyses of cases of human behaviour that involve the principles of 
behavioural interaction. It is suggested, thus, that the 
interactional psychology position summarized and focused by Endler & Magnusson is 
presently a potpourri of disparate considerations. It shares with social behaviourism’s 
behavioural interaction theory the impulse to decrease the schism between traditional 
approaches to personality and behaviourism. But it does so by combining things that do 
not harmonize. Although it aims towards integration, it does not meaningfully integrate its 
disparate elements. It only has the semblance of a unified theory, and it does not have data 
to support its central principles. Attainment of unity, in the present view, can be gained 
only through unified analysis at a basic level. If interactional psychology accepts this path, 
it may require progressively approaching more and more closely to a social behaviourism 
position. 


The direct interaction of behavioural interaction 


The behavioural interaction principles of social behaviourism do not include the same data 
base as that of interactional psychology. The interaction concept is a behavioural concept. 
The statistical concept of interaction, moreover, may be considered to be inconsistent with 
the principles of behavioural interaction. This can be exemplified by reference to a 
hypothetical experiment. Let us say that we take a group of white dogs which we condition 
to press a bar in the presence of a higher tone, and another group of black dogs which we 
condition to press a bar in the presence of a lower tone — both tones being very high and 
thus inaudible to the human ear. Later, we use the dogs in a classroom experiment, having 
the students tabulate the number of bar pressing responses the dogs make under two 
conditions — yellow overhead light when the high tone is presented and green light when 
the low tone is presented — with no reinforcement. There are thus two main conditions that 
are observable to the students, the dogs' colour and the light colour in the chamber; the 
tone conditions are unknown to the observers. Let us say that one light is for the animals 
much more pleasant than the other, and that the black dogs are much more active than the 
white dogs. In analysing the data, using a 2 x 2 analysis of variance, the students would 
find some variance due to the person variable (type of dog) and some to the situational 
variable (the type of light). But most of the variance would reside in the interaction between 
person and situation. The likely conclusion would be that there were situation factors, 
person factors and a third realm of causation, the interaction of the two. 

Now, although the relevant term in the experiment is an interaction, this is a statistical 
artifact of the design of the experiment. We can see that there is not an interaction in a 
meaningful, psychological sense. The actual causative variable in the interaction term 
concerns the conditioning of a response to the specific stimulus. The students, who are 
ignorant of this actual causative variable, and who observe the large interaction term in the 
design, would conclude that there must be some interaction process between light colour 
and the two types of dogs. This would be erroneous. It is suggested that we have an 
analogous situation in the interaction personality studies. When we see, for example, a 
personality x treatment interaction between internal-external personality types and 
success-failure situations, this should not be taken to indicate that the causation is any 
different than usual. It means only that we do not know what the personality factors and 
the situational factors are. The causation may be just as direct as in the dog study, where 
the interaction is a design artifact, and there is no basis for assuming an interaction 
process. In the present view personality tests do not ordinarily tell us in basic terms what the 
differences are between the subjects. It is suggested that analysis of and stipulation of what is 
meant by personality and what is meant by the situation would reveal the direct lines of 
causation, without resort to an unspecified interaction interpretation. It is only the absence 
of that analysis in interactional psychology that leads to the assumption of interaction 
causation. Thus, social behaviourism's behavioural interaction conception suggests that 
modes of causation are direct. Unanalysed interactions indicate ignorance of direct causative 
relations. 
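
The arithmetic behind this argument can be made concrete with a small sketch. It assumes, as the dog example appears to intend, that each type of dog was earlier conditioned to respond to a different light colour. The colour names, the conditioning assignment and every cell mean below are hypothetical values chosen only for illustration, not data from any study. The sketch partitions the between-cell variation of a balanced 2 x 2 design into dog, light and interaction components, and then shows that a single direct variable (whether the light shown is the one to which the animal was conditioned) accounts for exactly the variation that the design labels 'interaction'.

```python
from itertools import product

# Hypothetical cell means (e.g. responses per session); illustrative only.
cell_means = {
    ("black", "red"): 10.0,
    ("black", "green"): 3.0,
    ("white", "red"): 1.0,
    ("white", "green"): 6.0,
}
dogs, lights = ("black", "white"), ("red", "green")

# Assumed conditioning history, unknown to the observers in the example.
conditioned_to = {"black": "red", "white": "green"}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def sums_of_squares(cells):
    """Between-cell sums of squares for a balanced 2 x 2 design, one unit per cell."""
    grand = mean(cells.values())
    dog_mean = {d: mean(cells[d, l] for l in lights) for d in dogs}
    light_mean = {l: mean(cells[d, l] for d in dogs) for l in lights}
    ss = {"dog": 0.0, "light": 0.0, "interaction": 0.0}
    for d, l in product(dogs, lights):
        ss["dog"] += (dog_mean[d] - grand) ** 2
        ss["light"] += (light_mean[l] - grand) ** 2
        # What remains of the cell mean after removing both main effects.
        ss["interaction"] += (cells[d, l] - dog_mean[d] - light_mean[l] + grand) ** 2
    return ss


def ss_conditioned_stimulus(cells):
    """Sum of squares for the direct variable: is this the conditioned light?"""
    grand = mean(cells.values())
    matched = mean(v for (d, l), v in cells.items() if conditioned_to[d] == l)
    unmatched = mean(v for (d, l), v in cells.items() if conditioned_to[d] != l)
    return sum(
        ((matched if conditioned_to[d] == l else unmatched) - grand) ** 2
        for d, l in cells
    )


ss = sums_of_squares(cell_means)
total = sum(ss.values())
for source, value in ss.items():
    print(f"{source:12s} SS = {value:5.1f}  share = {value / total:.2f}")
print("conditioned-stimulus SS =", ss_conditioned_stimulus(cell_means))
```

With these illustrative means the dog and light terms take small shares and the interaction term takes most of the variation, and the sum of squares for the conditioned-stimulus variable equals the interaction sum of squares. In a 2 x 2 layout the two are the same contrast, which is the sense in which the interaction term can be read as a direct, if unrecognized, causal variable.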


The lack of analysis of interactional psychology 


It has been suggested in several examples already that the interactional psychology 
approach is not analytical. The question of analysis is central in differentiating social 
learning theory generally from social behaviourism (and it might be added in differentiating 
social behaviourism from other behaviourisms, such as operant behaviourism), and the 
distinctions should be drawn, as indicated in the following sections. 


The environment. To begin, it has been suggested that the interactional psychology position 
recognizes the influence of (1) the environment, (2) the person, and (3) interaction. Let us 
for the moment ask some of the questions that arise in terms of making such an unspecified 
conceptual statement into an heuristic theory. It is said in interactional psychology that the 
environment has an effect upon behaviour. Such a statement would ordinarily arise out of 
extensive research on the manner in which the environment does affect behaviour. Since 
environments are quite complex the simple statement that the environment has an effect 
would not be very informative without knowing something about the environment as well 
as the principles involved and the nature of its effects. Endler & Magnusson (1976) agree 
with the need for considering in greater detail the constituents of situations. But they do 
not indicate how this is to be done. Analysis of the basic characteristics of what the 
environment is and the principles by which it has its effects is the province of learning 
theory. It is suggested, thus, that interactional psychology faces the task of looking to the 
study of learning for its basic theory, or of remaining on a vague, naturalistic level. What 
course is planned must be indicated. In either case interactional psychology must either 
employ that which is already present (for example, one of the present learning theories)
or it must innovate the needed foundation.


Personality. The same is true for the next ‘element’ of interactional psychology. Indication 
that personality is one of three elements in an approach involving 
personality-environmental interaction demands specification of personality. Endler & 
Magnusson (1976) refer generally to cognitive and emotional factors in personality, but no 
specification is given. Mischel (1973), in switching from his previous position that there are 
no general personality characteristics, has also taken a similar position. That is, he refers 
to cognitive and emotional constructs of personality — but he provides little specification of 
what these are. In neither case is research introduced to empirically characterize the nature 
of these personality constructs. Mischel does utilize a concept similar to that of the basic 
behavioural repertoires of social behaviourism. He states that the individual learns ‘the 
potential to generate vast repertoires of organized behavior’ and he introduces the term
‘cognitive and behavioral (social) competencies’ (1973, pp. 265-267). Bandura until very
recently had no conception that played the role of personality in social learning theory. He 
has recently, however, added this dimension to his theory. He calls the construct P to stand 
for ‘cognitive and other internal events that can affect perceptions and actions’ (Bandura, 
1978, p. 345). But he also gives very little specification of what these internal personality 
events are, how they come about, or how they have their effects. Bandura has also referred 
to the construct of 'self-efficacy', in a manner very much like the social behaviourism 
construct of the ‘self-concept’ and has only begun research on this construct. Thus, in this 
area, also, social learning theory and interactional psychology have only included a general 
philosophy, not a theory of personality. 


Interaction. Mischel (1973) referred to person-situation interaction. But he did not specify 
what the nature of the interaction was, or what principles were involved. As in
Endler & Magnusson (1976), as has been indicated, interaction in this position has been
specified only as a statistical artifact, as a separate contributor to the determination of
behaviour outside of the effects of the environment and personality. The same is true of
Bandura. He includes the principle of interaction in his later works but provides little 
theoretical or experimental specification or justification. 

It is important to indicate that a major difference between social learning theory and 
interactional psychology, on the one hand, and social behaviourism, on the other, resides 
in the development of the central concepts and principles of the environment, the 
personality, and interaction. The difference in analysis will be indicated in the following 
sections. 


The behavioural analysis of social behaviourism 


The social behaviourism approach in this area may be schematized as shown in Fig. 1. As 
indicated, the individual learns a set of basic behavioural repertoires, which constitute his 
personality, through his contact with his environment (S1). The way he behaves at a later
time (B) will be a function of the later situation (S2) as well as a function of his personality
repertoires. In this sense there is an interaction, a direct interaction, since his behaviour will
depend upon the situation (S2) but also upon his personality repertoires. Another type of
interaction resides in the fact that the individual's behaviour (B) will in turn affect the 
environment and this will act back upon the individual in various ways, as will be 
indicated. 

[Figure 1: schematic with the labels ‘previous situations’, ‘personality repertoires’, ‘present situation’ and ‘behaviour’]

First, however, the importance of analysis must be stressed. The schema indicated in
Fig. 1 is merely an abstraction of a general approach, the major purpose of which is to
characterize the approach (see Staats, 1975, p. 69). The conception serves as an heuristic
theory, however, only to the extent that its elements are specified theoretically and
experimentally. It should be emphasized that unlike social learning theory and interactional 
psychology, social behaviourism has included analysis as a major methodological 
characteristic from its beginning. The first description of behavioural analysis, so much a 
part of behavioural psychology today, was made in the first general social behaviourism 
work (Staats, 1963, pp. 459-460; 1964, pp. 121-144). It is thus relevant at this point to
indicate something about the role of social behaviourism’s behavioural analysis in the 
context of the present considerations, in the next three sections. 


Analysis and the environment 


Social behaviourism recognizes the fact that a personality theory that deals with the 
environment is soundly based only to the extent to which it provides concepts by which to 
analyse and specify the environment and the principles by which it has its effects. The role 
of learning theory is to deal on a basic level with the characteristics of the environment and 
the principles by which it has its effects. From the social behaviourism view it is thus 
necessary to include a basic learning theory in a personality theory that treats of the 
environment. Thus, social behaviourism includes a new basic learning theory in its 
foundation (see Staats, 1968b, 1970, 1975, ch. 2).

It should be noted, however, that the environment enters the social behaviourism 
conception at two points (actually, at a continuing number of points), that is, at the point 
of the original learning of the basic behavioural repertoires, and in later situations. Thus, in 
this approach it is necessary to specify the environmental conditions associated with the 
learning of personality (see, for example, Staats, 1968a, 1971a; Staats et al., 1970, for an
experimental and theoretical account of how the child learns his cognitive basic
behavioural repertoires).

To understand human behaviour, however, it is necessary also to specify the situational
environment in which the individual functions, S2 in Fig. 1. Without such analysis incorrect
conclusions may be drawn; the active constituents of the present situation must be known. 
To illustrate, the classic LaPiere study (1934) was used by Mischel (1968) to substantiate 
his contention that there were no general personality characteristics across situations — by 
in this case showing the lack of consistency between verbal statements and actual 
behaviour. LaPiere found that in response to a written inquiry subjects indicated they 
would refuse restaurant service to Oriental customers, but when faced with the actual 
situation of an Oriental couple accompanied by a Caucasian they served these customers 
readily. Mischel's interpretation of this ‘personality inconsistency’ was not warranted in
reference to the data, however. The restaurant owner's negative attitude towards Orientals, 
let us say, would constitute a previously learned aspect of his emotional-motivational 
personality system. That would be an enduring, consistent aspect of personality. However, 
the attitude, although it would tend to elicit negative behaviours, would not be the only 
determinant of the individual's behaviour. In the presence of additional directive (incentive) 
stimuli that would inhibit negative behaviours towards the Orientals (for example, the
presence of the Caucasians, or the opportunity to make some money) the restaurant
owner might behave in a positive manner. Social behaviourism’s behavioural interaction 
conception suggests that there are general personality processes across situations. However, 
to see that generality it is ordinarily necessary to analyse the elements of the situations in 
which the individual is behaving. It should be noted that it is also necessary to employ a 
learning theory in this analysis that recognizes the several basic functions that stimuli can 
have (that is, emotion eliciting, rewarding, and directive), as will be indicated. 


Analysis and personality 

The social behaviourism conception of personality has already been described in brief. The 
point that is relevant here concerns the analytic nature of the theory. The conception of 
personality, it should be stressed, was built from below. Take for example the realm of the 
cognitive aspects of personality now being generally stressed by social learning theory and 
interactional psychology. Social behaviourism from the beginning began to treat 
systematically the central area of language-cognitive development and function. It was the 
first behavioural approach to treat the personality functions of language as well as the 
learning aspects, and this has involved a lengthy series of experimental and theoretical 
works (see, for example, Staats, 1957a, b, 1961, 1963, 1968a, 1971a, b, 1972, 1973, 1975; 
Staats & Staats, 1957, 1958; Staats et al., 1962; Staats et al., 1964; Staats & Butterfield, 
1965; Staats et al., 1970; Staats & Hammond, 1972; Staats & Warren, 1974). 
Language-cognition consists of a number of parts (or repertoires). Without specifying what 
these are, how they are learned, and how they have their effects, a theory that says it 
includes cognition as a central element does so only on a general philosophical level. 

One other example will be given, that of the emotional aspects of personality — which are 
also prominently referred to in interactional psychology and social learning theory. Again, 
social behaviourism has built ‘from below’ a systematic experimental and theoretical 
treatment of the emotional-motivational aspects of personality. As has been indicated 
social behaviourism conceives of the individual's emotional-motivational system to consist 
of all the stimuli — social stimuli, work stimuli, sexual stimuli, recreational stimuli, values 
(as in religion or politics), food stimuli, aesthetic stimuli and so on, and various negative 
stimuli and events, as well as the words representing such stimuli — which elicit an 
emotional response in the individual (Staats, 1968b, 1975). Different personality types, such
as whether the individual is heterosexual or homosexual, a writer or a chemist, religious or 
unreligious, a music lover or a sports fan, extravert or introvert, masculine or feminine, and 
the like, prominently involve the nature of the emotional-motivational system.

Social behaviourism has specified the principles involved in the formation (learning) of 
the emotional-motivational system in basic study involving physiological emotional 
responses (Staats et al., 1962; Staats & Hammond, 1972) as well as the manner in which 
emotional responses are learned through language (Staats & Staats, 1957, 1958). The 
theory has also specified the basic principles by which the emotional-motivational system 
functions in affecting the individual's overt instrumental behaviour — that is, that 
emotion-eliciting stimuli serve (1) to condition emotional responses to new stimuli, (2) as 
reinforcers for instrumental behaviour, and (3) to elicit directly approach instrumental 
behaviours in the positive case and avoidance behaviours in the negative case (Staats,
1968b). Moreover, the motivational aspects of the emotional-motivational personality
system have been specified in experiments, for example, that deprivation of a stimulus 
increases the intensity of the positive emotional response elicited by the stimulus, its 
reinforcement value, and the strength to which it will elicit approach instrumental 
behaviours for the individual (Staats & Hammond, 1972; Staats et al., 1972; Staats & 
Warren, 1974; Harms & Staats, 1978). 
In addition, however, the personality functions of social behaviourism’s theory of the
emotional-motivational system have been shown. Thus, Staats et al. (1973) conducted a 
series of three experiments to show how the individual's emotional-motivational 
personality system affects the way he learns new attitudes, how his personality system 
affects what will reward him, and how his personality system affects his overt behaviours. 
The Strong Vocational Interest Blank was employed to measure the individual’s 
emotional-motivational system in all three studies. Thus, in one study subjects were 
selected to have positive emotional responses in one area and negative emotional responses 
in another. Subjects with just the opposite emotional-motivational systems were selected. 
When these subjects were put in the same stimulus situation, their different 
emotional-motivational systems mediated different overt instrumental choice or preference 
behaviours. In another study not yet published (Staats & Warren, in mimeo) subjects with 
either masculine or feminine personality traits were selected. All subjects later were equally 
presented with the same emotion-conditioning procedure. Each conditioning item 
contained a word that was positive-emotion-eliciting for a masculine person and 
negative-emotion-eliciting for a feminine person, in one condition — and the reverse in 
another condition. It was found that the same items — presented to individuals with 
different personality systems with respect to sex interests — conditioned the subjects with 
different personalities in the opposite direction. Again, this study provided direct support for 
the stipulation of what personality is composed of, in this case the emotional-motivational 
aspects of personality. And the study showed how personality differences result in the 
individual having different experiences and in being conditioned differently than someone 
with a different personality, even when faced with the same environment. 

This line of work is only in its infancy. It is necessary that a number of studies be 
conducted which involve specifying what is meant by personality in terms of basic 
behavioural repertoires. On the basis of the analysis, studies can then be devised to indicate
how the personality is learned, and how it has its functions in determining what the person 
experiences, what he learns, how he behaves, and what he further becomes as a 
consequence. 


Analysis and interaction 


The third element in the theories being discussed is that of interaction. Interactional 
psychology has been criticized herein for the ambiguity of its interaction conception and for 
its lack of specification of the principles of direct behavioural interaction in contrast to the 
artifactual statistical interaction. The present section will further indicate the manner in 
which interaction principles are specified in social behaviourism.

Social behaviourism has systematically elaborated various types of interaction (Staats, 
1963, 1971a, 1975). It is important to see that these types of interaction pertain to the 
individual himself as well as to more complex social constellations. Moreover, as will be 
seen, the same principles apply throughout the levels of theory development involved.


Person-environment interactions. One type of interaction is that which has been referred to
already. The environment acts on behaviour (which is a dependent variable). But behaviour 
is also an independent variable which acts back upon the environment. The three stimulus 
functions of the environment (emotion-eliciting, rewarding, and directive or incentive 
functions) have already been described. 


Person-behaviour interactions. Introduction of the concept of the personality repertoire, and
specification of the ways that the personality repertoires can affect the individual's later 
behaviour have been elaborated. These concerns fall within the general province of 
traditional personality theory. In this province lie individual differences in response
to the same situation, that is, the causative aspects of personality. The same principles of
the three-function learning theory are employed to specify the manner in which the 
personality repertoires affect the individual’s own behaviour. An important area is that of 
the individual's language-cognitive repertoires. Thus, the individual's self-language — and 
its emotional properties — have been seen to be involved in reasoning and planning and 
adjusting to future events (Staats, 1963, 1968a, 1971a, 1975). The self-processes that 
provide for self-direction of all kinds involve this type of interaction. For this principle of 
interaction to be useful it is necessary to specify what the mediating self-process repertoires 
are (largely of a language nature), as well as how they affect the individual's behaviour. 

Person-behaviour interaction has already been shown experimentally in the previously 
cited experiments. Thus, for example, the individual's emotional response to (interest in) 
certain career areas was shown to mediate his selection of articles to read (Staats et al., 
1973). The subject's emotional personality characteristics mediated his overt preference 
behaviour, in a very explicit behavioural analysis. 


Person-person (social) interaction. Interactional psychology has neglected consideration of 
the possibility that the principles of interaction encompass also the concerns of how 
individuals interact — a traditional interest of social psychology. Social behaviourism in its 
goal of providing a very comprehensive theory treated this topic in its first general account 
(Staats, 1963) in a chapter entitled ‘Social interaction’. ‘As a division of the scientific study 
of behavior, social psychology might be considered to be concerned with the interactions of 
individuals in groups, both with the behavior of members of the group as a dependent 
variable, and with the behavior as an independent stimulus variable controlling the 
behavior of other members of the group’ (Staats, 1963, p. 321). 

Interactional psychology has adopted the principle that the person’s behaviour may 
affect the social environment, which may then in turn act back upon the individual. But it 
does not specify what is involved. Social behaviourism has treated extensively the manner 
in which each person can serve as an attitude eliciting stimulus for the other, as a 
reinforcing stimulus for the other, and as a directive stimulus for the other as well as how 
continuing interactions occur between individuals. The analysis has been extended to treat 
such social psychology topics as attractions, leadership, conformity, aggression, sex 
behaviour, communication and persuasion and social perception (Staats, 1963, 1964, 
1968b, 1975). A large literature has been organized in this theoretical treatment. There is a
large potential for further theoretical and empirical elaboration of these behavioural
interaction principles and theory.


Person-group and group-group interactions. Social behaviourism has indicated clearly that 
its learning theory and concepts of personality apply not only to individuals, but also to 
groups and institutions. Like the individual, groups were said to have a system of stimuli 
that (1) elicited attitudes in its members (the A function), (2) would have reinforcing 
properties for its members (the R function), and (3) would have directive (or incentive) 
properties for its members (the D function). Thus, ‘(1) the A-R-D system of the individual 
[his emotional-motivational system] is a prominent aspect of his personality, (2) ... social
systems also have A-R-D characteristics, and (3) ... the two interact’ (Staats, 1975, p. 527).
Social behaviourism has further described the ways in which the personality of the 
individual and the ‘personality’ of the social entity interact. In addition, social 
behaviourism has presented principles of group-group interaction, another potentially large 
area of theoretical-experimental elaboration.


As in small-group interactions, large groups can interact with each other...in which the actions 
of one group constitute conditioning experience for members of the other group. The latter, as a 
consequence of the conditioning experience, then performs actions that in turn affect the
conditioning experiences of the first group. This reciprocal interaction may continue over long 
periods (Staats, 1975, pp. 529-530). 


It will only be added here that each of these types of interactions calls for theoretical 
extension in treating various phenomena of human behaviour and for further experimental 
stipulation. 


The multi-level theory of social behaviourism 


The above description of the various aspects of interaction treated within social 
behaviourism exemplifies the multi-level theory construction strategy that is an essential 
characteristic of the approach. The three-function learning principles are developed on the 
basic level of learning theory. Then at a lower level the principles are employed to 
construct a theory of the emotional-motivational system for the individual. This theoretical 
body then serves to provide the principles for treating social interaction between 
individuals. In another level of theory development, the total theoretical structure is then
extended to individual-group and then group-group interactions. Such a multi-level theory 
has various productive features not found in more specific treatments — an important one 
being the unification of presently disparate areas of psychology. It should be noted that a 
new type of theory construction is involved here. The traditional theory construction of 
behaviourism is a two-level type — basic learning principles, on the one hand, and 
applications to human behaviour on the other. Social behaviourism introduces the concept 
of a hierarchically organized theory of multi-levels (Staats, 1975). 

It should be noted that the present paper can indicate only a limited portion of the levels 
of theory important in social behaviourism. With the introduction of its new concept of 
personality and the principles of interaction, and the multi-level theory structure, social 
behaviourism is capable of dealing with topics hitherto outside of behavioural psychology 
(Fraisse, 1976). Thus, social behaviourism has presented the outline of a new theory of 
abnormal psychology (Staats, 1975, pp. 244-288), which can serve as the foundation for
extensive development. The behaviour modification approach to clinical treatment has been 
restricted by its conceptual foundation to treating separate behavioural symptoms. Social 
behaviourism indicates new methods of behavioural treatment and indicates an approach 
by which to deal with general personality in addition to specific symptoms (Staats, 1975, 
pp. 291-337). The concept of personality and the interaction principles provide a new basis 
for a behavioural child psychology (Staats, 1975, pp. 339-381). Behaviour modification in 
education has been limited largely to the use of reinforcement to remediate behaviour 
problems. Social behaviourism also provides the basis for the analysis of personality 
repertoires central to educational psychology such as intelligence, reading, mathematical 
learning and so on (Staats, 1975, pp. 382-413). Social behaviourism originally suggested 
the need for behavioural assessment (Staats, 1963, pp. 508-509), but this field has
developed in isolation from the field of psychological (personality) testing. Social 
behaviourism with its new concepts of personality provides a basis for new behaviourally
oriented developments in personality test construction and personality research (Staats, 
1975, pp. 414-453). The same concepts allow a rapprochement with traditional humanistic 
approaches to human behaviour, providing a basis for dispelling the schism between 
radical behaviourism and more subjectively based traditional conceptions (Staats, 1975, pp. 
457-490), a characteristic that, it is suggested, will be central in the new generation of
behaviourism. 

The important point here is that the concept of personality and the principles of 
interaction are developed successively throughout these various levels of treatment, 
providing the basis for great generality and unity. 
Preparadigmatic separatism and its drawbacks


Finally, it has been suggested that psychology is in a preparadigmatic state (Staats, 1968b,
1970, 1975; Yates, 1975), unlike the state of the more advanced sciences (Kuhn, 1962). In 
the present view the field of psychology today is divided and separated, especially in the 
United States. There are division separations, separations by journals, separation by 
theories and problem areas, separations by student training specialities. Moreover, the 
preparadigmatic separatism does not prepare today's psychologists to be aware of what a 
unified paradigm might be in psychology. Novel developments are expected to arise from 
specialized treatments, by people identified as specialists. No one expects a general theory 
that spans the various areas to provide a heuristic basis for considering special fields and 
problem areas. There is a firm disbelief in the possibility of general, comprehensive, 
unifying theory (see Shaw, 1972). This preparadigmatic separatism, it is suggested, leads to 
various problems in our science — one of them being the existence as separate entities of 
works and lines of work that should be intimately related. The present case is a good 
example. 

Social behaviourism has been offered as a general, paradigmatic theory, perhaps the first. 
It is not a descriptive theory. It aims to be heuristic, in ways that have general significance, 
but also down to the level of specific research, and theory extension to new problems. The 
methods of normal science would assess this theory for its heuristic value, in the various 
specialty areas it treats. It is suggested that the preparadigmatic nature of psychology has 
interfered with the normal process in the present case (and in others not dealt with here). 
The separatism of behavioural interaction theory from interactional psychology began 
when Bandura (1969) included interaction principles in the social learning theory approach 
without relating the principles to social behaviourism. Later, Mischel's (1973) revision of 
his previous views to include the concepts of personality and person-environment 
interaction did not relate itself to the behavioural interaction conception (Staats, 1971a).
Mischel's article was in the specialized area of personality and it became the focal source 
for interactional psychology's consideration of interaction principles, thus sidetracking any 
later relationships of interactional psychology to social behaviourism. 

It is inefficacious in science when highly related endeavours are separated. The author 
has recently completed a book manuscript dealing with the drawbacks to the 
preparadigmatic nature of psychology. In the present case of separatism it can be seen that 
interactional psychology has missed a number of elements that might have been productive 
for its development. Moreover, social behaviourism now contains much greater 
development of the principles of interaction and related topics than are available in 
interactional psychology. Interactional psychology and social learning
theory in general have already shown interest in developing along the lines already 
developed in social behaviourism. It would be a loss in science to have to redevelop 
elements that already exist. Centrally, however, social behaviourism has related the more 
fully developed interaction and personality principles to abnormal psychology, clinical 
psychology, child psychology, psychometrics and so on. Interactional psychology has not 
yet done this and there is an extensive foundation in social behaviourism that psychologists 
with interest in interactional psychology would find to be heuristic. The principles of 
interaction and personality have much greater significance and potential than has been 
presented in interactional psychology, and the more comprehensive theory should be made 
available to those interested in the principles. 

The preparadigmatic Zeitgeist, thus, is presently operating as an obstacle to integration 
and unification. If psychology is ever to develop a theory of a very general, paradigmatic 
nature this Zeitgeist must be replaced by methodological conditions that are conducive to 
cross-area theories and paradigms when they are presented and in fact that encourage the 
development of such efforts. For a paradigm to be evaluated according to the methods of
normal science, its subtheories in the various special areas must receive the same treatment 
as would a special area theory. A focus of the present paper is to influence change in this 
aspect of our preparadigmatic separatism. Psychology will not break out of its 
preparadigmatic state, recognized as a primitive state of development (Kuhn, 1962), until
such unification is actively sought. Until the preparadigmatic characteristic is recognized it 
will continue to present an obstacle to the integration and unification of knowledge, a
special interest of the present Journal.
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À preliminary comparison of Cantonese and American-English as taste 
languages 


Michael O'Mahony and Teresa Tsang 





Groups of monolingual American-English speaking Americans of Chinese descent were compared 
with Cantonese/American-English bilingual Chinese living in America to examine their taste 
descriptions for a set of aqueous solutions. Cantonese, unlike other languages, did not appear to 
differ greatly from American-English in its general taste descriptive strategy and depth of vocabulary, 
although Cantonese speakers had a tendency to use ‘glutamic’ as a descriptive term for monosodium 
glutamate. 





Although English speakers learn a fairly broad language for colour description at 
pre-school age the same is not true for taste description (O'Mahony et al., 1978). The 
absence of a satisfactory language for taste appears to continue to adulthood so 
psychophysicists generally supply the ‘primary’ descriptive categories: ‘sweet’, ‘sour’, 
‘salty’, ‘bitter’ (Hoover, 1956; Harper et al., 1966; Robinson, 1970) while ‘tasteless’ is 
generally added (Bartoshuk et al., 1964; Dzendolet & Meiselman, 1967; Meiselman & 
Dzendolet, 1967). To cover further possibilities, ‘indefinite’ (Bartoshuk, 1968), ‘detection’ 
(O'Mahony, 1973) or ‘unidentified’ (Gregson, 1966) categories have been used. However, 
without supplied categories subjects tend to use a wide variety of novel terms while still 
relying heavily on the primaries (O'Mahony & Thompson, 1977). 

It is thus worth studying other languages to see whether they have taste naming 
strategies different from English and also to highlight any assumptions inherent in the use 
of English itself. Early comparison of taste description between languages (Chamberlain, 
1903; Myers, 1904) was concerned primarily with the evolution and confusion of taste 
words in so-called ‘primitive’ cultures. More recently O'Mahony & Muhiudeen (1976; 
1977) noted the spontaneous use of the modifying comparative phrase for taste description 
in Malay, a technique which would be a useful adjunct in flavour profiling (Cairncross & 
Sjöström, 1950; Sjöström et al., 1957). 

Cantonese is a language that has not been formally investigated for its taste descriptive 
behaviour; certainly it has exact translations for the primary taste categories: ‘sweet’, 
‘sour’, ‘salty’, ‘bitter’ and ‘tasteless’. These are ‘teem’, ‘sün’, ‘harm’, ‘foo’ and ‘mooit 
yow mae doe’, respectively. (See Glossary, p. 226, for explanation of the phonetic English 
and correct Chinese characters). 

Although the correct place for an extensive study of Cantonese is China, advantage was 
taken of access to a group of 10 students who were bilingual in Cantonese and English as 
spoken in California (hereafter called American-English), to make a preliminary 
comparison of the two languages. 


Method 


Subjects were students at the University of California, Davis. There were 10 Cantonese/American- 
English ‘bilinguals’ (4F, 20-24 yrs; 6M, 21-27 yrs); these were students from Hong Kong who were 
studying at the University of California in American-English but whose first language was standard 
Cantonese. All had been in USA for at least 3 years. There was also a control group of 15 
American-English monolinguals (5F, 21-22 yrs; 10M, 21-23 yrs) who were native American of 
Chinese descent with little or no comprehension of Cantonese and inability to converse in the 
language. The latter group offers some measure of control for diet because both groups in America 
eat a diet which has a high proportion of Chinese food. The definition of bilingualism and 
monolingualism is complex because they are more parts of a Cantonese-English continuum than 
categories. Here the terms will be used as defined by the groups. 

Subjects were required to sip 10 ml samples of purified water and aqueous solutions of 300 mM 
sodium chloride, 300 mM sucrose, 300 mM sodium benzoate, 300 mM monosodium glutamate 
(MSG), 20 mM citric acid, 20 mM sodium carbonate and 200 µM quinine sulphate. The stimuli were 
sipped, held in the mouth approximately 3-5 s and expectorated; the experimenter demonstrated the 
technique to the subjects. Stimuli were presented in random order at room temperature (18-22 °C) in 
1 oz unwaxed paper cups, with an approximately 40 ml purified water rinse followed by 30 s rest 
between tastings to reduce sensitivity drift caused by adaptation to residual stimuli from prior tastings 
(O'Mahony, 1974; O'Mahony & Godman, 1974). 

Subjects were required to describe (not identify), in their own words, the taste of the stimuli; care 
was taken during the instructions not to mention any taste words so that the descriptions could come 
from the subject’s own descriptive repertoire without any bias due to suggestion. Subjects were 
allowed to use as many descriptive terms per stimulus as they wished. 

Stimuli were Reagent grade (sodium benzoate was U.S.P.) chemicals made up in water purified 
using a Milli-Q3 and Milli-Q2 system in series involving filtration and treatment by reverse osmosis, 
ion exchange and activated charcoal (Millipore Corp., Bedford, Massachusetts, USA). 

The monolinguals performed the experiment in American-English while the bilinguals performed 
the experiment in both language conditions. The two conditions were spaced by a 3-week interval 
and the order of the American-English and the Cantonese conditions was counterbalanced over 
subjects. In both conditions, all conversation, including instruction and greeting, was in the 
appropriate language. 

In their initial condition, whether it be Cantonese or American-English, subjects were asked not 
only to describe the taste but also to state whether it was the same or different from prior stimuli. 
The demand characteristics of the procedure were such that subjects did not expect all stimuli to be 
the same. Only subjects who could distinguish all stimuli as different continued in the experiment. 


Results 


Taste descriptive words for each stimulus were noted, ignoring hedonic (‘nice’, ‘nasty’), 
qualifying (‘very’, ‘slightly’) and temporal descriptions. The descriptions for the 15 
American-English monolinguals and the 10 Cantonese/American-English bilinguals are 
given in Table 1; Cantonese responses are listed under their English equivalents. 
Descriptions are scored as being single primary descriptions (e.g. ‘salty’, ‘sweet’), 
combinations — primaries combined with other terms (e.g. ‘bitter-rancid’), ‘glutamic’, 
‘tasteless’ or their combinations, or novel terms involving none of the aforementioned (e.g. 
‘alcohol’, ‘meaty’). When a description like ‘sour-salty’ was given, it was scored both as a 
sour combination and a salty combination; thus, the numbers of categories of description 
in the body of the table may exceed the total number of descriptions given. 
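
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_description`, the category labels and the convention that combined terms are joined by a hyphen are hypothetical choices made here, not part of the original procedure. A combination is counted once for every primary (or ‘glutamic’/‘tasteless’) term it contains, which is why category counts can exceed the number of descriptions.

```python
# Hypothetical sketch of the scoring rule; names and labels are assumptions.
PRIMARIES = ("sweet", "sour", "salty", "bitter")
SPECIAL = ("glutamic", "tasteless")


def score_description(description):
    """Return the scoring categories for one hyphen-joined taste description."""
    terms = [t.strip().lower() for t in description.split("-") if t.strip()]
    named = [t for t in terms if t in PRIMARIES or t in SPECIAL]
    if not named:
        # No primary, 'glutamic' or 'tasteless' term involved: a novel description.
        return ["novel"]
    if len(terms) == 1:
        # A single primary (or 'glutamic' / 'tasteless') used on its own.
        return [terms[0]]
    # A combination is scored once under every named term it contains.
    return [f"{t} combination" for t in named]


examples = ["salty", "sour-salty", "bitter-rancid", "meaty", "salt water"]
scores = {d: score_description(d) for d in examples}
```

On this sketch ‘salt water’ falls under the novel terms because it is not a hyphenated combination containing ‘salty’, in line with the treatment of that response reported for sodium chloride.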

The actual descriptions used are worth noting (phonetically spelled Cantonese names will 
be given in brackets). Although the ‘primary’ taste descriptions were common, the use of 
‘novel’ descriptions in English confirms earlier findings (O'Mahony, 1973; O'Mahony & 
Godman, 1973; O'Mahony & Thompson, 1976, 1977). There was no tendency to use the 
modifying comparative phrase in American-English or Cantonese, unlike Malay 
(O'Mahony & Muhiudeen, 1976, 1977). 

Sodium chloride was generally described in both languages as ‘salty’ (harm) or its 
combination; here ‘salt water’ (yim soyeu!), not being a strict combination, was scored as 
a novel term. Sucrose was generally described in both languages as ‘sweet’ (teem) or 
‘sugar’ (English only), while the reported tendency to combine ‘sweet’ with other terms 
(O'Mahony, 1976; O'Mahony & Thompson, 1977) was noted, with frequent use of 
‘sweet-sugar water’ (teem tong soyeu!) in both languages. 

As with O’Mahony & Thompson’s (1977) English study, ‘bitter’ (foo) and its 
combinations (e.g. ‘bitter-rancid’, ‘bitter-base’) were the usual response for quinine 
sulphate along with some novel terms. Citric acid was generally described as ‘sour’ (sün) or 
its combinations (e.g. ‘sour-lemon’) as noted previously in English (O'Mahony, 1976; 
O'Mahony, Hobson, Garvey, Davies & Birt, 1976; O'Mahony & Thompson, 1977). 
Although not in Cantonese, ‘lemon’ was sometimes used as a descriptive term, confirming 
earlier studies (Richter & Campbell, 1940; O’Mahony & Stevens, 1975; O’Mahony, 1976; 
O’Mahony, Hobson, Garvey, Davies & Birt, 1976). Water was generally described as 
‘tasteless’ (mooit yow mae doe) as seen before (O’Mahony & Thompson, 1977) although 
the novel term, ‘water’ (soyeu) was also common. There was only a slight tendency for 
Anderson’s (1959) reported ‘bitter’ descriptions to occur. 

As before (O'Mahony & Thompson, 1977), MSG was generally described as ‘salty’ 
in combination with other primaries although the bilinguals tended to use ‘glutamic’ 
(maydzing) or its combinations; this tendency was almost absent in monolinguals. Novel 
terms (fish, beef broth-Raman noodle) were also noted. Sodium carbonate was generally 
described as ‘bitter’ (foo) or its combinations (e.g. ‘bitter-sweet’, ‘bitter—bland’) 
confirming earlier studies (O'Mahony & Thompson, 1976, 1977). Although novel 
descriptions (‘chalk’, ‘burnt extract’) occurred, there were no traditional reports of 
‘alkaline’ (Moncrieff, 1967). Sodium benzoate tended to be described as ‘sweet’ or ‘bitter’ 
in both languages while the combination ‘bitter-sweet’ occurred for both groups but only 
in the English condition. O'Mahony & Thompson (1977) had found a predominance of 
‘sweet’ descriptions. 


Discussion 
While few firm conclusions can be drawn from a preliminary study, some indications for 
future research can be noted. 

The descriptive strategies used by both groups in the American-English condition and by 
the bilingual group in the Cantonese condition were the same as those noted in English 
(O'Mahony, 1973, 1976; O'Mahony & Godman, 1973; O'Mahony & Stevens, 1975; 
O'Mahony, Hobson, Garvey, Davies & Birt, 1976; O'Mahony, Kingsley, Harji & Davies, 
1976; O'Mahony & Thompson, 1976, 1977). The four ‘primary’ English categories and 
their Cantonese equivalents ‘sweet’ (teem), ‘sour’ (sün), ‘salty’ (harm), ‘bitter’ (foo) 
along with ‘tasteless’ (mooit yow mae doe) were frequently used spontaneously although 
they were often modified by combination with other terms. The appropriate ‘primaries’ 
were used for sucrose (sweet), citric acid (sour), sodium chloride (salty) and quinine 
sulphate (bitter), while the usual trend of a spreading of descriptions across ‘primaries’ was 
noted for MSG, sodium carbonate and sodium benzoate, along with an increase in novel 
terms. This may be because of difficulties in applying inappropriate ‘primary’ descriptions 
to ‘non-primary’ stimuli. 

There appeared to be little of the previously reported sour—bitter confusion (Meiselman 
& Dzendolet, 1967; Robinson, 1970; Gregson & Baker, 1973; McAuliffe & Meiselman, 
1974), only a tendency for citric acid to elicit sour—bitter descriptions. This absence of the 
reported confusion was probably due to the small number of subjects available. 

An interesting effect was a tendency by Cantonese speakers (in both languages) to use 
‘glutamic’ (‘maydzing’, ‘like MSG’, ‘like Ajinomoto’) as a description similar to the 
primaries; American-English monolinguals tended to rely more heavily on ‘salty’. The use 
of ‘glutamic’ or ‘like MSG’ (Ikeda, 1912; Kuninaka, 1965) is hardly surprising among 
people who use it so commonly in food preparation, for the same reasons that ‘salty’ is 
hardly surprising among Europeans. What is perhaps surprising is its rare use by 
monolinguals who eat Chinese food; perhaps the English language habits were too strong 
to be affected by their dietary habits. Perhaps it is this aspect of the preliminary study 
which deserves further attention. It would be interesting to speculate whether ‘glutamic’ 
would be a ‘primary’ taste had the early taste research been performed in Asia rather than 
Europe. 

English Cantonese glossary 


English        Cantonese    English phonetic spelling 
Bitter         苦           foo 
Glutamic       味精         maydzing 
Salt water     鹽水         yim soyeu! 
Salty          鹹           harm 
Sour           酸           sün 
Sugar          糖           tong 
Sugar water    糖水         tong soyeu! 
Sweet          甜           teem 
Tasteless      沒有味道     mooit yow mae doe 
Water          水           soyeu! 


Key for English phonetic spelling: 
All Cantonese words are monosyllabic. 
ü is pronounced with the French ‘u’ vowel sound. 
eu is pronounced as a cross between ‘oo’ and ‘i’. 
! denotes an abrupt ending, similar to but not as strong as a glottal stop. 

Current issues in the psychology of reasoning 
J. St B. T. Evans 





Recent studies of deductive reasoning are reviewed with respect to three questions: (i) Do people 
reason logically? (ii) Is reasoning introspectible? (iii) Is reasoning sequential? It is argued that the 
evidence of reasoning experiments suggests a negative answer to all three questions. This conclusion 
is interesting, since the last two questions at least might be answered affirmatively by common sense, 
and affirmative answers would be more consistent with the assumptions of many psychologists in 
related fields. The question is raised, however, as to whether experimental studies have good external 
validity for the measurement of ‘reasoning’ as we would generally understand the term. 








In the post-war period, experimental research into thinking has been somewhat limited. In 
European psychology the main emphasis has been on studying thought processes in 
children, and most of this work has been interpreted in a Piagetian framework. In the 
United States a tradition of work on problem-solving and concept identification has been 
maintained with close links to learning theory. The emergence of ‘cognitive psychology’ on 
both continents has boosted research into perception, memory and language, but seems to 
have stimulated little additional study of thought processes. 

Over the past decade or so, however, a field of research which has grown steadily, if not 
spectacularly, is the study of deductive reasoning. The field is identifiable more by usage of 
certain paradigms than by association with any particular movement or theoretical 
approach. Indeed, the diversity of approaches represented in this area is one of its 
fascinations. The present article is concerned with an examination of research in this field. 
It is not, however, intended as a comprehensive review. The aim is to focus on certain 
issues arising from this work which, in my opinion, have some general theoretical interest 
in the psychology of thinking. Specifically, we shall examine three questions: 


(1) Do people reason logically? 
(2) Is reasoning introspectible? 
(3) Is reasoning sequential? 


In answering these questions we are, of course, giving a partial answer to the set of 
questions we could ask by replacing the word ‘reasoning’ by ‘thinking’ in the above. First, 
however, some general introduction to the field is in order. 

In reasoning experiments, subjects are asked to solve problems which have a well-defined 
structure in a system of formal logic. For example, a subject may be given the premises of a 
logical argument and asked what conclusion, if any, follows. Alternatively, he may be 
asked to consider the situations under which a logical proposition or rule might be true or 
false. An obvious common feature of such problems is that a normative solution is 
available — that required by the logical system. Subjects’ responses may then be measured 
as ‘correct’ or ‘incorrect’ against such a criterion. Our first question ‘Do people reason 
logically?’ is hence an obvious one to ask, and accounts for the commencement of work in 
this field. The other two questions arose as an unforeseen result of investigating the first. 

Research has been based principally on three systems of logic: Firstly, Aristotelean or 
syllogistic logic which employs quantified statements (concerning class relationships). These 
are conventionally presented as arguments with two premises and a conclusion, e.g.: 
All As are Bs 
Some Bs are Cs 


(Therefore) Some As are Cs 

Such arguments are only valid if the conclusion necessarily follows from the premises, 
otherwise they are fallacious. The above example is, in fact, a persuasive fallacy (if in 
doubt, let A = Men, B = People and C = Mothers). The syllogism would be valid if the 
first premise read, ‘All Bs are As’. 
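
The validity claims in this example can be checked mechanically by searching for a counterexample. The following Python sketch is an illustration only, not a procedure from the literature reviewed: it assumes the modern reading in which ‘All Xs are Ys’ carries no existential import, and the helper names (`all_are`, `some_are`, `is_valid`) are hypothetical. Each possible world is described by which of the eight kinds of individual (in or out of A, B and C) it contains.

```python
from itertools import product

# An individual is described by its membership of the classes A, B and C.
TYPES = list(product([False, True], repeat=3))  # (in_A, in_B, in_C)
A, B, C = 0, 1, 2


def all_are(model, x, y):
    """'All Xs are Ys' (assumed reading: no existential import)."""
    return all(ind[y] for ind in model if ind[x])


def some_are(model, x, y):
    """'Some Xs are Ys': at least one individual is both X and Y."""
    return any(ind[x] and ind[y] for ind in model)


def is_valid(premises, conclusion):
    """True if no model makes every premise true and the conclusion false."""
    for mask in product([False, True], repeat=len(TYPES)):
        model = [t for t, present in zip(TYPES, mask) if present]
        if all(p(model) for p in premises) and not conclusion(model):
            return False
    return True


# All As are Bs; Some Bs are Cs; therefore Some As are Cs.
fallacy = is_valid(
    [lambda m: all_are(m, A, B), lambda m: some_are(m, B, C)],
    lambda m: some_are(m, A, C),
)
# All Bs are As; Some Bs are Cs; therefore Some As are Cs.
repaired = is_valid(
    [lambda m: all_are(m, B, A), lambda m: some_are(m, B, C)],
    lambda m: some_are(m, A, C),
)
print(fallacy, repaired)  # prints: False True
```

The counterexample found for the first form corresponds to the Men, People and Mothers case: every A is a B and some Bs are Cs, yet no A is a C.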

The second system concerns reasoning with transitive relationships, i.e. placing objects in 
relation to each other on some kind of scale. Such arguments can be presented in a 
syllogistic form, e.g. 

John is taller than Paul 
Peter is shorter than Paul 


(Therefore) John is taller than Peter 


Presented in this form transitive inferences are sometimes known as linear syllogisms or 
three-term series problems. 

The third system, upon which much recent work has been based, is that of propositional 
logic. This can be regarded as the ‘standard’ logic of modern philosophy which has 
essentially superseded Aristotelean logic. In this form of logic theorems may be proved by 
alternatively applying rules of inference, or by use of truth table analysis. By definition, an 
argument is logically valid if a false conclusion cannot be derived from assumptions all of 
which are true. In the truth table method, all permutations of possible truth values of the 
propositions (p,q...) are considered, and if no such invalidating case is found, the 
argument is declared valid. The logical equivalence of reasoning by inference rules or truth 
tables does not, of course, entail any necessary psychological equivalence of a subject's 
performance of the corresponding tasks. Psychological work in this area has focused 
principally on people's reasoning with conditional sentences of the form If p then q and to 
a lesser extent on disjunctive sentences of the form Either p or q. 
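
As a minimal sketch of the truth table method, the Python fragment below enumerates every assignment of truth values to p and q and declares an argument valid when no row makes all premises true and the conclusion false. The function names are hypothetical, ‘If p then q’ is read here as material implication, and the contrast between inclusive and exclusive readings of ‘Either p or q’ is added purely for illustration.

```python
from itertools import product


def is_valid(premises, conclusion):
    """Truth table test: valid if no row has all premises true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


def if_then(p, q):
    # Assumed reading: material implication.
    return (not p) or q


def inclusive_or(p, q):
    return p or q


def exclusive_or(p, q):
    return p != q


# If p then q; p; therefore q.
modus_ponens = is_valid([if_then, lambda p, q: p], lambda p, q: q)
# Either p or q; p; therefore not q (depends on how 'or' is read).
or_inclusive = is_valid([inclusive_or, lambda p, q: p], lambda p, q: not q)
or_exclusive = is_valid([exclusive_or, lambda p, q: p], lambda p, q: not q)
print(modus_ponens, or_inclusive, or_exclusive)  # prints: True False True
```

The sketch establishes logical validity only; as noted above, it implies nothing about how a subject actually performs the corresponding task.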

For detailed, recent reviews of research into these three types of deductive reasoning, the 
reader is referred to Revlis (1975) — syllogistic; Johnson-Laird (1972) — transitive; and Evans 
(1978) — propositional. 


Questions 

(1) Do people reason logically? 

The question of whether people think logically relates to the general and ancient question 
of whether man is rational. It could be argued on a priori grounds that the answer to this 
question must be negative, since man is demonstrably irrational. It can also be argued on 
philosophical grounds that systems of formal logic are not intended as hypotheses about 
the nature of thought, but as techniques for assessing the correctness of argument. We 
would not, for example, pose as a general question ‘Do people think mathematically?’ 
People either are or are not able to apply a set of mathematical rules, according to their 
training. 

I have been aware for some time of the danger of interpreting behaviour on reasoning 
tasks in terms of formal logic (see for example, Evans, 1972a). Suppose that you wish to 
know whether or not a subject has the ‘competence’ to make a reductio ad absurdum 
inference. You could give him a problem that is solvable by this logical device and see 
whether or not he solves it. With this attitude you would be able inevitably to claim that 
he either ‘possesses’ this logical rule or he does not. 

The problem is that his behaviour may actually be determined by factors irrelevant to 
the logic of the problem. 
It is important to distinguish the question of rationality or logicality at the level of (i) the 
process or strategy, and (ii) the product of that process. The observed behaviour may be 
classified as logical or illogical according to whether the response fits the dictates of formal 
logic. This does not necessarily tell us whether the underlying process is logical or not. For 
example, on some problems a non-logical response bias may lead fortuitously to a correct 
answer. Equally, a subject may follow a logical strategy but, due to misunderstanding the 
problem in some way, still arrive at an erroneous conclusion. 

We are concerned here with the question of whether or not the process rather than the 
product of reasoning is logical. The alternative positions, as Evans (1972a) pointed out, 
are of two kinds: illogical and non-logical. For example, Wason's (1966) hypothesis that 
reasoning errors result from a verification bias assumes that subjects respond to the logical 
structure but with an erroneous strategy. This I call illogical reasoning. A subject 
responding to some feature of the task other than its logical structure is doing what I call 
non-logical reasoning, though others might prefer to say that he is not reasoning at all. Response bias 
effects come into this latter category. 

Currently, the most popular view in the literature is that the reasoning process does 
accord with principles of logic. This rationalist school of thought is led by Mary Henle. Her 
views have been so influential in the USA that two recent collections of papers, edited by 
Falmagne (1975) and Revlin & Mayer (1978) have been effectively dedicated to her cause. 
In a foreword to the more recent of these she states, ‘I have never found errors which could 
unambiguously be attributed to faulty reasoning’. 

Although I do not believe that the question ‘Do people reason logically?’ can be 
answered with ecological validity by looking at performance on reasoning tasks, I am 
forced by the Henle disciples to consider it in this context. What we are really assessing is 
the evidence for the view that naive subjects in reasoning experiments apply the rules of 
formal logic in order to solve them. One might suppose that the mere fact that nearly all 
reasoning experiments from those of Sells (1936) onwards show that subjects' responses 
deviate frequently from the logically correct choices, would necessitate a negative answer. 
The rationalists, however, are not so easily discouraged. The reason for this lies in a 
problem most clearly stated by Smedslund (1970). One cannot tell whether or not a subject 
has reasoned logically unless one assumes that he has interpreted the premises of the 
argument correctly, and vice versa. 

It was this ambiguity that permitted Henle (1962) to present her ‘logical’ theory. Her 
paper is, in fact, based upon a qualitative and selective analysis of protocols, given by 
subjects who were asked to perform syllogistic reasoning on problems with thematic 
content. Her conclusion is that subjects who 'accept the logical task' make deductions in a 
logical manner. Logical errors arise because subjects misinterpret the premises presented. 
Specifically, they may drop, add to or alter the premises given. As mentioned earlier the 
Henle hypothesis has enjoyed considerable popularity in the recent literature both with 
respect to syllogistic and propositional reasoning. The appeal is simple. If Henle is correct, 
then any deviant solution is due to misinterpretation of the premises. Hence we can ask 
‘how could the subject have interpreted the problem such that by logical inference he 
would arrive at the conclusion stated?’ Thus, in effect, we can test psycholinguistic 
hypotheses about certain sentences by asking subjects to reason with them. We will briefly 
examine the evidence for the Henle hypothesis in the context of syllogistic and 
propositional reasoning tasks. 

The earliest theory of syllogistic reasoning was that of Woodworth & Sells (1935) who 
predicted an ‘atmosphere effect’ in that subjects would prefer to accept a conclusion where 
that conclusion was congruent with the mood (e.g. negativity) of the premises. A second 
principle (caution) predicted that negative conclusions would be preferred to affirmative 
ones, and particular conclusions to universals. Since the figure of the syllogism (and hence 
its logical structure) is ignored by both principles this is an example of what I have called a 
‘non-logical’ theory of reasoning (Evans, 1972a). An alternative ‘illogical’ theory is that of 
Chapman & Chapman (1959) who propose that subjects respond to the logical structure 
but make invalid conversions and probabilistic inferences. It has been claimed by Begg & 
Denny (1969) that both theories fit the data well and are not empirically distinguishable. 

A more recent model (Revlis, 1975) is more Henle-like in that it proposes that illicit 
conversions of the sort discussed by Chapman & Chapman occur at a representational 
stage, prior to making deductions. However, Revlis cannot account for all performance in 
this manner and is forced to postulate an additional guessing mechanism which is 
influenced by non-logical response biases. 

A recent paper by Johnson-Laird & Steadman (1978) has, however, complicated matters 
considerably. They discovered a strong and previously unknown effect of ‘figural bias’ on 
syllogistic reasoning. This, they claim, cannot be accounted for by any previous theory. The 
model that they present (post hoc) is not compatible with the Henle position because at 
least some errors arise from mistakes in deducing information from the premises, rather 
than simply at the stage of representing the information contained in the premises. All in 
all, then, the evidence does not support formal logic as a competence model in syllogistic 
reasoning. 

A detailed critical evaluation of the Henle hypothesis in propositional reasoning has been 
made in a previous article (Evans, 1978). The arguments given there will be briefly 
summarized. It was argued that if Henle was correct then subjects should reason 
consistently to some particular interpretation of a given sentence such as a conditional rule. 
The evidence is against this in that subjects reasoning on some tasks behave as if they 
interpreted the conditional as an implication, whilst on other tasks they appear to interpret 
it as equivalence. Behaviour on the Wason selection task (see Wason & Johnson-Laird, 
1972) is consistent with neither interpretation. There is even evidence that subjects are 
inconsistent on the same task (e.g. Taplin & Staudenmayer, 1973). 

Further problems for the Henle hypothesis arise from the discovery of non-logical 
response bias effects on all paradigms considered. Two examples will be considered here 
(i) negative conclusion bias and (ii) matching response bias. The former applies to tasks 
where the subject is asked to draw simple inferences about the truth or falsity of 
propositions. For example, with a conditional rule, If p then q, there are four inferences 
which might be drawn: 


                                      Given    Conclude 
Modus ponens (MP)                     p        q 
Denial of the antecedent (DA)         not p    not q 
Affirmation of the consequent (AC)    q        p 
Modus tollens (MT)                    not q    not p 


Of these four inferences, only MP and MT are valid if the rule is taken to express material 
implication. All four inferences are valid if the rule is taken as a biconditional or 
equivalence. In experimental studies, subjects tend to give MP most frequently, then MT, 
and a fair number also make DA and AC. However, this pattern of responding is 
considerably altered by the introduction of negative components as independent studies by 
Roberge and myself have demonstrated. 
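
The validity of the four inferences under the two interpretations can be verified by brute force. The sketch below is illustrative only and its names (`INTERPRETATIONS`, `INFERENCES`, `is_valid`) are hypothetical; it checks, for every combination of truth values of p and q, whether the conclusion holds whenever both the rule and the given premise hold.

```python
from itertools import product

# Two candidate readings of the rule 'If p then q'.
INTERPRETATIONS = {
    "implication": lambda p, q: (not p) or q,
    "equivalence": lambda p, q: p == q,
}

# Each inference is a (given, conclude) pair of functions of p and q.
INFERENCES = {
    "MP": (lambda p, q: p, lambda p, q: q),
    "DA": (lambda p, q: not p, lambda p, q: not q),
    "AC": (lambda p, q: q, lambda p, q: p),
    "MT": (lambda p, q: not q, lambda p, q: not p),
}


def is_valid(rule, given, conclude):
    """Valid if the conclusion holds in every case where rule and given both hold."""
    return all(
        conclude(p, q)
        for p, q in product([True, False], repeat=2)
        if rule(p, q) and given(p, q)
    )


table = {
    name: {label: is_valid(rule, *INFERENCES[label]) for label in INFERENCES}
    for name, rule in INTERPRETATIONS.items()
}
for name, row in table.items():
    print(name, row)
```

Under implication only MP and MT come out valid; under equivalence all four do, so DA and AC responses count as errors only on the implication reading.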

Roberge (1971) observed that subjects make more logical errors when the antecedent of 
the rule is negated (as, for example in the rule If not p then q). I noted, more specifically, 
that fewer MT and more AC inferences are made when the antecedent is negative (Evans, 
1972b). The description of ‘errors’ depends on assuming that the implication interpretation 
is ‘correct’. There are two possible interpretations of these results. One, along Henle lines, 
would be the supposition that the introduction of the negative alters the subject's 
interpretation of the rule. There are difficulties with this, however, as either of the plausible 
interpretations of the conditional (implication, equivalence) requires the MT inference to be 
made. The second alternative is a response bias in favour of negative conclusions. The 
more often accepted inferences, MT with an affirmative antecedent and AC with a 
negative antecedent, both involve making negative rather than affirmative conclusions. Note 
that this is in line with the ‘caution’ principle of Woodworth & Sells. If the response bias 
hypothesis is correct, then the same should apply on different inferences. Evidence 
consistent with this hypothesis has been found on reductio ad absurdum reasoning (Evans, 
1972c) and denial of the antecedent (Evans, 1977a). In a recent study (Pollard & Evans, in 
preparation) subjects’ reasoning about the consequent of the rule was accountable solely in 
terms of this bias. The effect is, however, only partially supported by work on disjunctives 
(Roberge, 1976). 
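
The bookkeeping behind the negative conclusion bias account can be made explicit. In the sketch below, which is an illustration with a hypothetical function name, a rule is described only by whether its antecedent and its consequent are negated, and the function reports whether the conclusion of each inference is a negative statement.

```python
def conclusion_is_negative(inference, antecedent_negated, consequent_negated):
    """Polarity of the conclusion for a rule 'If [not] p then [not] q'."""
    if inference == "MP":   # given the antecedent, conclude the consequent
        return consequent_negated
    if inference == "DA":   # given the antecedent denied, conclude the consequent denied
        return not consequent_negated
    if inference == "AC":   # given the consequent, conclude the antecedent
        return antecedent_negated
    if inference == "MT":   # given the consequent denied, conclude the antecedent denied
        return not antecedent_negated
    raise ValueError(f"unknown inference: {inference}")


# 'If p then q': MT concludes 'not p', a negative conclusion.
mt_affirmative = conclusion_is_negative("MT", False, False)
# 'If not p then q': AC concludes 'not p', while MT now concludes 'p'.
ac_negative = conclusion_is_negative("AC", True, False)
mt_negative = conclusion_is_negative("MT", True, False)
print(mt_affirmative, ac_negative, mt_negative)  # prints: True True False
```

A bias towards negative conclusions would therefore favour MT when the antecedent is affirmative and AC when it is negative, which is the pattern described above.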

Matching response bias was originally observed in an experiment where subjects were 
asked to construct verifying and falsifying cases of conditional rules in which the presence 
and absence of negative components was varied (Evans, 1972d). The effect is simply a 
preference for subjects to construct cases which match the values named in the rules 
irrespective of the presence of negative components. For example, the preferred falsification 
of the rule ‘If there is an A there is not a 3’ would be A3. The rule ‘If there is not a D 
there is a 7’ would, however, be falsified, usually by D7, which also matches but is logically 
different (an equivalent construction to the first rule would be, say, G3). Evans & Lynch 
(1973) used the matching bias hypothesis to predict behaviour on the Wason selection task, 
and it is now generally accepted as a more fundamental explanation than the previous 
hypothesis of ‘verification bias’ (see Johnson-Laird & Wason, 1977, pp. 151-157). Again, a 
Henle-type explanation in terms of the negative changing the interpretation of the rule does 
not seem tenable, since some responses consistent with matching could not be predicted by 
any kind of interpretational change. Matching bias generalizes to rules of the form p only if 
q but not to disjunctive rules (either p or q). This discrepancy was first pointed out by Van 
Duyne (1973) but on inadequate evidence. Recent studies in my own laboratory (Evans & 
Newstead, in preparation) have, however, confirmed the absence of matching bias with 
disjunctive rules. Clearly the effect is dependent upon the linguistic content of 
conditionals — within that context, however, its effect still appears to be non-logical. 
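
The difference between the matching case and the logically falsifying case for the two example rules can be sketched as follows. This is an illustration under the implication reading, on which a rule is falsified by a true antecedent with a false consequent; the function and variable names are hypothetical.

```python
def logical_falsifier(antecedent_negated, consequent_negated):
    """Return (letter_is_named_value, number_is_named_value) for the falsifying case.

    The rule has the form 'If there is [not] a LETTER there is [not] a NUMBER'.
    Falsification needs the antecedent true and the consequent false.
    """
    # 'a D' is true when D is present; 'not a D' is true when another letter is present.
    letter_is_named_value = not antecedent_negated
    # 'a 7' is false when another number is present; 'not a 3' is false when 3 is present.
    number_is_named_value = consequent_negated
    return letter_is_named_value, number_is_named_value


# Matching bias: choose the values named in the rule, whatever the negatives.
MATCHING_CASE = (True, True)

rule_1 = logical_falsifier(antecedent_negated=False, consequent_negated=True)   # If A then not 3
rule_2 = logical_falsifier(antecedent_negated=True, consequent_negated=False)   # If not D then 7
print(rule_1 == MATCHING_CASE, rule_2 == MATCHING_CASE)  # prints: True False
```

For the first rule the matching case A3 coincides with the logical falsifier, whereas for the second rule the logical falsifier pairs some letter other than D with some number other than 7 (for example G3), so the matching choice D7 departs from it.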

The evidence is, then, against the extreme form of the rationalist position, so far as 
performance in reasoning experiments is concerned. This does not mean, of course, that 
subjects do not and cannot reason logically under some circumstances. Before leaving this 
issue we must consider an alternative, weaker form of the rationalist hypothesis that has 
recently emerged. 

Various psychologists have borrowed the linguistic distinction of 
competence/performance and applied it to reasoning. Such theorists could dismiss the 
evidence for non-logical response biases as ‘performance factors’ while still maintaining 
that subjects have an underlying logical competence which manifests itself on these tasks. 
Those who have pursued this line have tended to regard formal logic as unsuitable for 
describing people’s logical competence, because, for example, it requires us to accept 
counter-intuitive arguments (see Wason & Johnson-Laird, 1972; Osherson, 1975; Braine, 
1978). 

A possible theoretical development lies in an attempt to formulate some kind of ‘natural 
logic’, appropriate to language and the real world, to describe this competence. For 
examples of recent attempts along these lines, the reader is referred to Braine (1978) on 
propositional reasoning, and Johnson-Laird & Steadman (1978) on syllogistic reasoning. 
The difficulty of this sort of approach clearly lies in its testability. A theory of competence 
alone is not a scientific theory in the Popperian sense. In order to generate predictions 
which could be falsified empirically, a reasoning theory must also embody performance 
factors and a device for generating the overall behavioural effect to be observed. Such an 
attempt has been made by Revlis (1975) but, as we have noted, his model has some new 
data to explain. Braine's paper is quite unsatisfactory in this respect. The competence 
model is precisely and explicitly presented, but its application in predicting performance is 
nowhere made clear. Furthermore, he fails to mention much of the evidence of performance 
factors in the kinds of task to which his theory applies. Johnson-Laird & Steadman (1978) 
do somewhat better in predicting, for example, that operations requiring an extra stage of 
processing will be more prone to error. They do not, however, give an explicit set of rules 
for generating predictions, and the fit that they obtain is essentially of a post hoc nature. 

The studies reviewed in this section do not support the Henle hypothesis, that subjects 
follow a logical strategy when solving reasoning problems. The ‘competence model’ 
approach is in its infancy and has yet to face up to, let alone overcome, the problem of 
testability. None of these studies seem to tell us much about the nature of the thought 
processes involved. One possible means of obtaining such an understanding might be 
simply to ask the subject what strategy he adopts on the task. But is such a method valid? 
We will look at this question in some detail. 


(2) Is reasoning introspectible? 


The question of the validity of introspective data is an ancient one in philosophy and 
psychology, but has yet to be satisfactorily resolved. Experiments into controlled 
introspection around the turn of the century by Galton (1879) and the Würzburg School 
(see Humphrey, 1951) cast serious doubts on the usefulness of the method, and Watson's 
behaviourism sought to banish ‘mentalism’ and its associated introspective methodologies. 
Introspection, however, is alive and well and rears its head in numerous articles in the field 
of cognitive psychology. It is quite common in the memory literature, for example, to find 
authors reinterpreting their results in the light of reported ‘strategies’ of their subjects (e.g. 
Hitch & Baddeley, 1976). In fact, the status of introspective data is the subject of current 
controversy in both cognitive and social psychology. Thus, for example, Pylyshyn (1973) 
has attacked the concept of imagery as mentalistic, while Kosslyn & Pomeranz (1977), in 
reply, have defended the use of introspective data as a method of aiding the interpretation 
of behavioural measures. Similarly, Nisbett & Wilson (1977), in a review of social 
psychological literature, propose that subjects do not have access to their thought 
processes, a conclusion which is disputed by Smith & Miller (1978). 

There is no intention here to dispute the notion that subjects have mental experiences or 
that they are able to talk about some of them. Such descriptions, which I will term 
phenomenal reports, may have scientific value. Take, for example, the phenomenon of 
subjective colour. When a certain type of black and white pattern is rotated at a critical 
speed, people report seeing coloured rings. This, in turn, leads psychologists to consider 
whether or not the patterns are interfering differentially with the operation of the 
mechanisms in the brain whose activity is normally correlated with phenomenal reports of 
the colours in question. Quite different problems arise, however, when a psychologist 
elicits a reported strategy which he infers to have mediated the subject's behaviour. This 
technique differs from the phenomenal report in that the subject is asked to describe not 
what he experiences, but how or why he performed a particular cognitive task. 

The use of a strategy report rests upon the mentalistic assumption that the subject is 
able to observe and report the mental processes which underlie his behaviour. However, 
the basic assumption that mental processes are somehow separate from or prior to 
behaviour has been strongly challenged in philosophy. Ryle (1949) and Wittgenstein (1968) 
have argued that such a notion is an illusion created by our use of language. Wittgenstein 
suggests that the thoughts ‘accompanying’ or ‘underlying’ actions are inseparable from 
those actions, e.g. ‘“There has just taken place in me the process of remembering” means 
nothing more than “I have just remembered...”’ or ‘when I think in language, there 
aren't “meanings” going through my mind in addition to the verbal expressions: the 
language is itself the vehicle of thought.’ Ryle, of course, has argued that ‘mental states’ as 
‘ghostly’ entities do not exist, and that our descriptions of them in language are actually 
references to behavioural dispositions. It is not surprising, therefore, that he attacks the 
notion that self-knowledge can be achieved through ‘privileged access’ to private mental 
processes. He demonstrates the logical impossibility of supposing all mental acts to be 
introspectible, and this casts doubts upon the assumption that self-knowledge is ever 
derived in this way. How then, is it achieved? Quite simply, ‘our knowledge of other 
people and ourselves depends upon our noticing how they and we behave.’ 

It is important to point out that this approach does not preclude the possibility that an 
individual may correctly report the reason for his actions. Suppose, for example, that you 
see a child step into the path of a car, and the driver swerve. If you ask the driver why he 
swerved he will say that he did so in order to avoid hitting the child. This is certainly the 
correct reason and the driver is identifying the objective cause of his action. This does not, 
however, indicate any special access to private mental processes. Any other observer of the 
scene could equally identify the cause of his action. Even if an individual has a superior 
ability to identify the cause of his actions, this might only indicate that he has had more 
experience of observing his own behaviour in various situations than have other people. 
Similarly, the ability to say what you are going to do in the future does not demonstrate 
intentionality. It merely shows that you are able to predict your behaviour on the basis of 
previous experience. In practice, of course, people are not always able to identify the causes 
of their behaviour or predict their future actions. 

Psychologists who employ introspective reports do not usually state the philosophical 
assumptions on which they are working. In view of the foregoing considerations, however, 
we must regard the ability of a subject to identify the causes of his behaviour in a given 
situation as a strictly empirical question. Even this approach, however, leads us into logical 
problems. We can only assess the validity of a subjective report in a situation where we 
have an independent objective method of determining the cause of the subject’s action. In 
such a situation, however, we do not need his introspective report. 

No psychologist, however, regards all cognitive processes as introspectible. It is obvious, 
for example, that we are not aware of the process which allows us to see a 
three-dimensional world — we just see it. If all such processes were reportable it would 
certainly save us from doing experiments. Why, however, are some processes in cognition 
considered to be reportable? The attitude of Sternberg (1977) is quite typical. In reviewing 
information-processing approaches to cognition, he distinguishes between fast processes 
such as word recognition, and slow processes such as problem-solving. The latter, he 
supposes, are somewhat amenable to introspective report. A big factor, however, seems to 
be the willingness of the subject to make a report, and the confidence and consistency with 
which he delivers it. 

The worrying thing about this situation is that no one appears to have any clear 
criterion for deciding which processes are introspectible and which are not. Furthermore, 
those making use of reported strategies provide no evidence that they measure what they 
purport to measure, i.e. causal processes underlying behaviour. As we shall see, there is 
considerable evidence against this hypothesis. 

I was confronted with this problem when researching into the Wason selection task. It 
will probably assist the reader to have a brief description of this task at this point. In one 
form the subject is presented with four cards which he knows to have a letter on one side 
and a number on the other. The facing sides show ‘G’, ‘D’, ‘4’ and ‘8’. He is given the 
rule: ‘If there is a G on one side of the card, then there is a 4 on the other side of the card.’ 
He is told that the rule applies to these four cards and may be either true or false. The 
logical task is to turn over those cards and only those cards which could help him to find 
out whether the rule is true or false. The correct answer depends on how the rule is 
interpreted. If treated as an implication he should choose ‘G’ and ‘8’ since only a G paired 
with a number other than a 4 could falsify the rule. If the rule is treated as equivalence, 
then he should turn all four cards, as each could lead to a falsifying case (G and not-4, or 
not-G and 4). 
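
The logic of the task can be checked mechanically. The following is a minimal sketch, not part of the original studies: it assumes, for illustration only, that the hidden side of a letter card can be only ‘4’ or ‘8’ and the hidden side of a number card only ‘G’ or ‘D’. The function names are hypothetical. A card is worth turning only if some possible hidden side would falsify the rule under the chosen interpretation.

```python
# Visible faces from the task description; restricting the hidden sides to the
# same two letters and two numbers is a simplifying assumption of this sketch.
LETTERS = ["G", "D"]
NUMBERS = ["4", "8"]

def implication(letter, number):
    # 'If G then 4' read as implication: false only for G paired with a non-4.
    return not (letter == "G" and number != "4")

def equivalence(letter, number):
    # 'G if and only if 4': false whenever exactly one of the two holds.
    return (letter == "G") == (number == "4")

def informative_cards(rule):
    """Return the visible faces whose hidden side could falsify the rule."""
    chosen = []
    for face in LETTERS + NUMBERS:
        if face in LETTERS:
            possible = [(face, number) for number in NUMBERS]
        else:
            possible = [(letter, face) for letter in LETTERS]
        if any(not rule(letter, number) for letter, number in possible):
            chosen.append(face)
    return chosen

print(informative_cards(implication))  # ['G', '8']
print(informative_cards(equivalence))  # ['G', 'D', '4', '8']
```

Under implication only ‘G’ and ‘8’ can reveal a falsifying pairing; under equivalence every card can.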

In fact, few subjects do either of these things. Most omit the logically necessary '8' and 
may also select ‘4’. In general, with a rule of the form If p then q, people tend to choose p 
alone, or p and q. Wason (1966) originally interpreted this as a verification bias — subjects 
are seeking the confirming instance of p and q rather than the falsifying instance of p and 
not q. This was later elaborated into an information-processing model by Johnson-Laird & 
Wason (1970). Now, as mentioned earlier, the verification theory has been revised in the 
light of Evans & Lynch's (1973) finding that subjects tend to choose the matching values p 
and q irrespective of the presence of negatives. 

A difficulty arose initially, however, in that Goodwin & Wason (1972) had found support 
for the verification rather than matching bias explanation in the introspective reports of 
their subjects. Their technique consisted of asking subjects to write down the ‘reason’ for 
their decision to select, or not select, each card. Subjects appeared to demonstrate 
verification when selecting p and q, and falsification when selecting p and not q. 

Wason and I conducted some joint research to resolve this paradox (Wason & Evans, 
1975; Evans & Wason, 1976). In the first experiment, subjects were given the selection task 
both with the usual rule If p then q and a negative rule If p then not q. On the negative 
rule, the matching responses p and q are also logically correct. Subjects did both problems 
and were asked to justify their responses à la Goodwin & Wason. Most subjects gave 
similar responses (p and q) on both problems, but very dissimilar verbal justifications. 
When matching on the affirmative rule, subjects demonstrated an apparent verification bias 
as in the Goodwin & Wason study. On the negative rule, however, justifications showed 
clear evidence of a falsification bias. Wason & Evans publish several protocols of subjects 
who, receiving the negative problem first, demonstrate apparently ‘complete insight’, which 
disappears entirely when they are subsequently presented with the affirmative problem. These results 
suggest clearly that the subjects are, in fact, rationalizing their responses in the context of 
the instructions given. A subsequent experiment (Evans & Wason, 1976) provided further 
support for this hypothesis. 
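
The claim that matching is logically correct on the negative rule, but not on the affirmative one, can be verified by enumeration. This is an illustrative sketch under the usual reading of the conditional as implication; the function and variable names are hypothetical and do not come from the experiments.

```python
def cards_worth_turning(consequent_negated):
    """Logical cases whose hidden side could falsify the conditional.

    The rule is 'If p then q', or 'If p then not q' when
    consequent_negated is True.
    """
    def rule_holds(p, q):
        consequent = (not q) if consequent_negated else q
        return (not p) or consequent

    # Each card shows the truth value of one component; the other is hidden.
    cards = {
        "p": ("p", True),
        "not-p": ("p", False),
        "q": ("q", True),
        "not-q": ("q", False),
    }
    needed = []
    for name, (component, shown) in cards.items():
        for hidden in (True, False):
            p, q = (shown, hidden) if component == "p" else (hidden, shown)
            if not rule_holds(p, q):
                needed.append(name)
                break
    return needed

MATCHING = ["p", "q"]  # the values named in the rule, ignoring negation

affirmative = cards_worth_turning(consequent_negated=False)
negative = cards_worth_turning(consequent_negated=True)
print(affirmative, affirmative == MATCHING)  # ['p', 'not-q'] False
print(negative, negative == MATCHING)        # ['p', 'q'] True
```

The same selection (p and q) is therefore an error on one problem and the correct answer on the other, which is what allows response and justification to be dissociated.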

The evidence does not then support the notion that thought processes are available to 
introspection. On the contrary, the belief in introspection appears to rest on precisely the 
philosophical mistakes described by Ryle and Wittgenstein. Self-knowledge does indeed 
seem to depend upon observation of one's own behaviour, rather than ‘privileged access’ 
to mental processes. 

Other work needs to be re-examined from the standpoint that reasoning processes cannot 
be safely assumed to be introspectible. Transitive reasoning, for example, has been subject 
to a controversy with regard to whether visual imagery plays a functional role or not. 
Originally proposed by De Soto et al. (1965), this position has been strongly advocated by 
Huttenlocher (1968). Clark (1969) proposed an alternative explanation in terms of linguistic 
principles, which led to a rather heated controversy (see Johnson-Laird, 1972). 
Huttenlocher's theory is constructed largely on the basis of introspective data, which she 
also seems to regard as evidence for her theory. Clark maintains that there is no evidence 
that imagery plays a functional role. Shaver et al. (1975) have apparently supported the 
imagery position by use of ‘converging operations’ similar to those employed by Brooks 
(1968) and Paivio (1971). However, as Pylyshyn (1973) has pointed out in his critique of 
mental imagery in memory research, operational definitions deriving from different 
converging operations may have no link other than a phenomenological conception of 
imagery. In a recent article (Evans, 1980) I have found alternative explanations for most of 
the converging evidence offered in support of imagery in transitive inference. If 
introspective evidence is regarded as inadmissible, the usefulness of the concept of imagery 
seems to be considerably reduced. 

We now turn to the final question, in which perhaps the most intuitively ‘obvious’ 
assumption about reasoning will be questioned. 


(3) Is reasoning sequential? 


It is generally believed that there are two main kinds of thought. The first is free associative, 
unstructured thinking, such as occurs in dreams, daydreams, etc.; this, with certain 
additional assumptions, corresponds to Freud’s primary process. Most experimental studies 
of thinking have, however, been concerned with the second kind, secondary process thinking: 
goal-oriented, reality-based, problem-solving thought. Neisser (1963) reviews this traditional dichotomy in 
the light of the development of information processing psychology. With reference to 
Kubie’s (1958) distinction of ‘preconscious’ thinking, with its role in creativity, he 
distinguishes a main sequence of thought from accompanying parallel processes: ‘The 
“secondary process” is the main sequence itself proceeding sequentially through the steps 
whose logic has been programmed by previous experience. Kubie's “preconscious” consists 
of multiple processes, while the sequential process is his “conscious”.’ 

Most psychologists would probably accept Neisser's view that reasoning, being secondary 
process thought, is sequential in nature. The identification of the main sequence with 
consciousness is problematic since it suggests that the process is introspectible. The 
problems arising on this latter issue would suggest either that the main sequence is not 
conscious in this sense, or else that other processes also influence behaviour in reasoning 
experiments. 

The assumption that reasoning is sequential is clearly reflected in models which have 
been proposed. For example, Clark & Chase (1972) present a detailed model of 
sentence-picture verification. It is proposed that the subject works through a fixed series of 
stages: encoding of the sentence, encoding of the picture, comparison of the two codes and 
generation of a true/false response. The model assumes that the time taken to process each 
stage is simply added together to produce the total time. 

Other models of more complex reasoning processes already referred to are of a similar 
structure (e.g. Johnson-Laird & Wason, 1970; Revlis, 1975). They envisage one process, 
going through a series of stages to generate the response. Now, while this conception of 
reasoning is very plausible on common-sense grounds, it will be argued here that there is 
evidence against its validity in explaining data collected in reasoning experiments. The two 
main alternatives are parallel thought processes, and stochastic thought processes. 

The idea that more than one kind of thought process is involved in experiments on the 
selection task was proposed by Wason & Evans (1975). It will be recalled that in their 
experiment, response frequencies were attributed to ‘matching bias’, while verbalizations 
were regarded as rationalizations. Wason & Evans postulate a dual process theory of 
thinking in which it is proposed that two different kinds of thinking occur: Type 1 
processes, which are non-verbal and non-introspectible but control actual selections, and 
Type 2, verbal processes which underlie the rationalization. It is interesting to note that the 
Type 2 process corresponds much more closely to the common-sense notion of reasoning. 
The ‘rationalization’ is itself a form of reasoning: the subject asks, in effect, ‘Given the 
structure of the problem and the instructions, what reason could I have had for the 
response I made?’ The explanations offered are of a logical nature, in that they show high 
internal consistency. The striking point is, of course, that this Type 2 verbal reasoning 
process does not appear to control the actual reasoning behaviour. 

Wason & Evans cite a number of previous studies which could be regarded, 
retrospectively, as evidence for dual processes. For example, when subjects are forced 
verbally (at a Type 2 level) to contradict their choices on the selection task, many refuse, 
nevertheless, to alter their selections (e.g. Wason & Johnson-Laird, 1970). At the same 
time, the subject often appears to be under stress, as though he were caught between two 
conflicting processes. (For a striking example, see the protocol cited by Wason, 1969.) An 
interesting phenomenon also occurs on Wason’s 2-4-6 problem (see Wason, 1968). This is 
a rule-learning task in which subjects are induced to obtain a wrong hypothesis which is 
very hard to refute. When told that their suggested rule is incorrect, subjects frequently 
continue to use the same rule but reformulate it verbally in different words. For example, a 
subject might say (of a three-number set) ‘the rule is that the numbers increase and that the 
difference between the adjacent numbers is the same’. On being told it is wrong, his next 
suggestion may be ‘the rule is that the middle number is the arithmetic mean of the outer 
two’. Could it be that a Type 1 process perseverates, and that a Type 2 process produces 
alternative verbalizations? 
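
That the two verbalizations describe the same rule can be seen by testing them against each other. The sketch below is illustrative only: the restriction to ascending triples of integers from 1 to 10 is an assumption made here, and the function names are hypothetical.

```python
from itertools import product

def constant_step_increasing(a, b, c):
    # First verbalization: the numbers increase and adjacent differences are equal.
    return a < b < c and (b - a) == (c - b)

def middle_is_mean(a, b, c):
    # Second verbalization: the middle number is the arithmetic mean of the outer two.
    return 2 * b == a + c

triples = list(product(range(1, 11), repeat=3))
ascending = [t for t in triples if t[0] < t[1] < t[2]]

# On ascending triples the two formulations never disagree.
agree_on_ascending = all(
    constant_step_increasing(*t) == middle_is_mean(*t) for t in ascending
)
print(agree_on_ascending)  # True
```

The equal-difference clause and the arithmetic-mean clause are algebraically identical; the formulations differ only in that the first also demands an increasing order, so a descending set such as 6, 4, 2 satisfies the second but not the first.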

The dual process theory, as stated by Wason & Evans, envisages the processes not so 
much in parallel as in alternation. Recent considerations of the statistical structure of 
reasoning responses, however, give a different perspective. In a recent article (Evans, 
1977b) I argued that response frequencies generated on reasoning tasks might be accounted 
for by a stochastic approach. If 60 per cent of subjects solve a problem it could be 
supposed that they knew something the others did not, or used a superior strategy. An 
alternative approach, considered in the paper, is to suppose that all subjects have a 0.6 
probability of getting it right. 

In support of this stochastic approach, I was able to show that card-selection frequencies 
on Wason's selection task are statistically independent. This causes considerable problems 
for insight theories which attach particular significance to the combinations of cards 
chosen. If these combinations are merely chance combinations of the outcomes of 
independent statistical processes, a global insight description seems quite inappropriate. 
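
What independence implies for card combinations can be illustrated numerically. In this minimal sketch the marginal selection rates are hypothetical values chosen for illustration, not data from any study; the point is only that, under independence, the frequency of every combination follows from the four marginal rates alone.

```python
from itertools import product

# Hypothetical marginal selection probabilities for the four cards
# (illustrative values only, not data from any experiment).
marginals = {"p": 0.9, "not-p": 0.1, "q": 0.6, "not-q": 0.2}

def combination_probability(selected, marginals):
    """Probability of selecting exactly `selected`, assuming independent choices."""
    prob = 1.0
    for card, rate in marginals.items():
        prob *= rate if card in selected else (1.0 - rate)
    return prob

cards = list(marginals)
table = {}
for flags in product([True, False], repeat=len(cards)):
    chosen = frozenset(card for card, flag in zip(cards, flags) if flag)
    table[chosen] = combination_probability(chosen, marginals)

# The four most frequent combinations under these assumed rates.
for chosen, prob in sorted(table.items(), key=lambda item: -item[1])[:4]:
    print(sorted(chosen), round(prob, 4))
```

With these assumed rates the combination ‘p and q’ is the most frequent and ‘p and not q’ is rare, without any appeal to states of insight.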

In formulating a specific stochastic model, it soon became apparent that a sequential 
model, along the lines of Clark & Chase, would simply not work. An additive latency 
model, such as theirs, would have to become multiplicative in order to generate response 
probabilities. Thus, if a given response has a certain probability of passing through an 
interpretational stage and another of passing through a response bias stage, the overall 
probability is clearly the product. This would mean that a response which was against the 
prevailing response bias (e.g. anti-matching) would never become probable however strong 
the interpretational tendency. However, it is known that such responses, if logically correct, 
do become highly probable when realistic rather than abstract content is employed (e.g. 
Wason & Shapiro, 1971). Thus an additive probability model was devised — each response 
probability was determined by three parameters — a logical tendency (arising from 
interpretation); a response tendency (arising from response bias) and a weighting factor. 
Thus the good logical performance on realistic problems can be accounted for by a shift in 
the weighting factor towards interpretation. The strength of matching bias, as such, does 
not diminish — only the weighting which is attached to it. 

Now, while an additive latency model implies sequential processing, an additive 
probability model implies parallel processing. In effect, the response is made either on the 
basis of logic or on the basis of matching. The weighting factor gives the probability of 
making the decision on one basis or the other. 
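
The contrast between the two kinds of model can be made concrete. The sketch below assumes a simple weighted-sum form that is consistent with the verbal description given here; it does not reproduce the equations of Evans (1977b), and the parameter names and values are hypothetical.

```python
def multiplicative_model(logical, bias):
    """Sequential reading: a response must pass both stages in turn."""
    return logical * bias

def additive_model(logical, bias, weight):
    """Parallel reading: with probability `weight` the response is decided
    on the logical basis, otherwise on the basis of response bias."""
    return weight * logical + (1.0 - weight) * bias

# Hypothetical values for a logically correct but anti-matching response.
logical_tendency = 0.95
matching_tendency = 0.10

ceiling = multiplicative_model(logical_tendency, matching_tendency)
abstract = additive_model(logical_tendency, matching_tendency, weight=0.2)
realistic = additive_model(logical_tendency, matching_tendency, weight=0.9)

print(f"multiplicative:              {ceiling:.3f}")
print(f"additive, abstract content:  {abstract:.3f}")
print(f"additive, realistic content: {realistic:.3f}")
```

In the multiplicative case the response probability can never exceed the response-bias term, however strong the logical tendency. In the additive case, shifting the weight towards interpretation makes the same response highly probable while the matching tendency itself is left unchanged.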


By some considerations which go beyond the published papers, we can see that a revised 
dual process theory can be reconciled with the stochastic model. What is referred to as the 
interpretational or logical process in the Evans (1977b) paper could be regarded as a Type 
2 process: the response (matching) bias component could be equated with a Type 1 
process. The dual process theory is revised in that it is now suggested that a Type 2 process 
is also involved prior to making the response, and that its influence accounts for the logical 
component of the behaviour. The additive probability model corresponds to the notion of a 
psychological conflict — Type 1 and Type 2 processes compete for the control of the 
behaviour. An attractive feature of the revised theory is that it supposes that a verbal 
rational process may acquire control of the behaviour — for example, when realistic 
materials are used. 

Type 2 control should not, however, lead one to expect the process to be introspectible. 
It is still supposed that the introspective report is a rationalization. It is a product of Type 2 
processes, rather than a description of them. Some Type 1 thinking may also influence the 
verbal justifications. This could explain a phenomenon noted by Wason & Evans (1975), 
namely that subjects’ explanations focus on the matching values which might arise on the 
other side of the card. Matching would thus continue to direct attention at a Type 1 level, 
influencing the Type 2 construction of a verbal rationalization. 

I should make it clear that these ideas are highly speculative on present data, and that a 
detailed programme of experimental research is being planned by Wason and myself to 
investigate the problem further. What is clear, however, is that a sequential, non-stochastic 
conception of reasoning does not appear able to account for the data in these experiments. 


Conclusions 


The answer to all three questions posed has been negative. People do not reason logically, 
and their reasoning is neither introspectible nor sequential in nature. We must, however, 
question the external validity of the experiments upon which these conclusions are based. 
In answering the three questions, the word 'reasoning' has been implicitly defined as 'that 
which goes on in reasoning experiments'. If this definition is acceptable then reasoning 
certainly does not possess the characteristics discussed. 

Not all psychologists, however, accept this kind of operational definition. Van Duyne 
(1973), for example, talks of subjects ‘trying to reason’ in these experiments. If reasoning is 
something subjects try to do, without necessarily succeeding, it obviously cannot be 
equated with the sum total of the processes generating the data. Van Duyne is evidently 
bringing some prior conception of reasoning to bear. Of course, we could define reasoning 
a priori as logical, introspectible and sequential. It is questionable whether such a process 
actually exists, and even more doubtful whether it plays any role in reasoning experiments. 
My real concern is that we should not, by virtue of labelling a paradigm as reasoning 
research, allow our prior beliefs about 'reasoning' to bias our interpretation of the data. 

The escape from this dilemma is, I think, to regard reasoning experiments as specialized 
problem-solving tasks which permit the study of thought processes. Their external validity 
for measuring 'reasoning' should not be assumed, any more than research on concept 
identification should be taken as telling us anything about real-life conceptual thinking. 
Viewed in this light, it seems to me that research on these tasks has suggested a number of 
interesting possibilities about the nature of thought. For example, it is very interesting to 
have to postulate dual thought processes to account for reasoning data, albeit on an 
artificial task, unrepresentative of real-life problems. If nature endowed us with dual 
processes, it was certainly not for the purpose of solving the Wason selection task. At the 
same time, it is clearly necessary for hypotheses derived about the nature of thought to be 
tested on a variety of other kinds of task. 

Having said this, it must be admitted that research on deductive reasoning has not yet 
told us a great deal about the nature of thinking. There is a limit to how much progress 
can be made in a field where most researchers rely on three fundamental assumptions, 
all of which are wrong. Also, research designs (including mine) have been too static. The 
independent variables (linguistic structure, etc.) have been too far removed from the subject 
to allow much insight into his thought process. Since a more dynamic approach cannot 
rely on introspection, it must find other means. Experiments currently going on in my own 
department are using converging operations, such as interference techniques, borrowed 
from other areas of cognitive psychology. 


In essence, research on ‘reasoning’ has not really achieved its intended discoveries about 
the rationality of man, but has accidentally produced some interesting questions about the 
nature of thought. We must realize that what we are studying in these experiments is not 
the kind of reasoning involved in solving real-life problems; rather it is the reactions of 
subjects to highly novel and artificial problems under somewhat stressful conditions. The 
artificiality of the psychological experiment is inescapable — merely presenting the problems 
in ‘thematic’ form does little to reduce the problem. To understand real-life reasoning we 
must study it, ethologically, under real-life conditions. At the same time experimental 
studies of thinking can give us useful psychological information, provided that we do not 
lose sight of the total situation to which the subject is responding. 


References 


Begg, I. & Denny, J. P. (1969). Empirical 
reconciliation of atmosphere and conversion 
interpretations of syllogistic reasoning errors. Journal 
of Experimental Psychology, 81, 351-354. 

Braine, M. D. S. (1978). On the relation between the 
natural logic of reasoning and standard logic. 
Psychological Review, 85, 1-21. 

Brooks, L. R. (1968). Spatial and verbal components 
of recall. Canadian Journal of Psychology, 22, 
349-368. 

Chapman, A. C. & Chapman, J. P. (1959). Atmosphere 
effect reexamined. Psychological Review, 58, 
220-226. 

Clark, H. H. (1969). Linguistic processes in deductive 
reasoning. Psychological Review, 76, 387-404. 

Clark, H. H. & Chase, W. G. (1972). On the process 
of comparing sentences against pictures. Cognitive 
Psychology, 3, 472-517. 

De Soto, C. B., London, M. & Handel, S. (1965). 
Social reasoning and spatial paralogic. Journal of 
Personality and Social Psychology, 2, 513-521. 

Evans, J. St B. T. (1972a). On the problems of 
interpreting reasoning data: Logical and 
psychological approaches. Cognition, 1, 373-384. 

Evans, J. St B. T. (1972b). Reasoning with negatives. 
British Journal of Psychology, 63, 213-219. 

Evans, J. St B. T. (1972c). Deductive reasoning and 
linguistic usage. Unpublished PhD Thesis, University 
of London. 

Evans, J. St B. T. (1972d). Interpretation and 
matching bias in a reasoning task. Quarterly Journal 
of Experimental Psychology, 24, 193-199. 

Evans, J. St B. T. (1977a). Linguistic factors in 
reasoning. Quarterly Journal of Experimental 
Psychology, 29, 297-306. 

Evans, J. St B. T. (1977b). Towards a statistical 
theory of reasoning. Quarterly Journal of 
Experimental Psychology, 29, 621-635. 

Evans, J. St B. T. (1978). The psychology of deductive 
reasoning. In A. Burton & J. Radford (eds), Thinking 
in Perspective. London: Methuen. 

Evans, J. St B. T. (1980). Thinking: Experiential and 
information processing approaches. In G. Claxton 
(ed.), Cognitive Psychology: New Directions. 
London: Routledge & Kegan Paul (in press). 

Evans, J. St B. T. & Lynch, J. S. (1973). Matching 
bias in the selection task. British Journal of 
Psychology, 64, 391-397. 

Evans, J. St B. T. & Newstead, S. E. (1977). 
Language and reasoning: A study of temporal 
factors. Cognition, 5, 265-283. 

Evans, J. St B. T. & Wason, P. C. (1976). 
Rationalization in a reasoning task. British Journal 
of Psychology, 67, 479-486. 

Falmagne, R. J. (ed.) (1975). Reasoning: 
Representation and Process. New York: Wiley. 

Galton, F. (1879). Psychometric experiments. Brain, 
2, 148-162. 

Goodwin, R. Q. & Wason, P. C. (1972). Degrees of 
insight. British Journal of Psychology, 63, 205-212. 

Henle, M. (1962). On the relation between logic and 
thinking. Psychological Review, 69, 366-378. 

Hitch, G. J. & Baddeley, A. (1976). Verbal reasoning 
and working memory. Quarterly Journal of 
Experimental Psychology, 28, 603-622. 

Humphrey, G. (1951). Thinking: An Introduction to its 
Experimental Psychology. London: Methuen. 

Huttenlocher, J. (1968). Constructing spatial images: 
A strategy in reasoning. Psychological Review, 75, 
550-560. 

Johnson-Laird, P. N. (1972). The three-term series 
problem. Cognition, 1, 57-82. 

Johnson-Laird, P. N. (1975). Models of deduction. In 
R. J. Falmagne (ed.), Reasoning: Representation and 
Process. New York: Wiley. 

Johnson-Laird, P. N. & Steadman, M. (1978). The 
psychology of syllogisms. Cognitive Psychology, 10, 
64-99. 

Johnson-Laird, P. N. & Wason, P. C. (1970). A 
theoretical analysis of insight into a reasoning task. 
Cognitive Psychology, 1, 134-138. 

Johnson-Laird, P. N. & Wason, P. C. (1977). 
Thinking: Readings in Cognitive Science. London: 
Cambridge University Press. 

Kosslyn, S. M. (1975). Information representation in 
images. Cognitive Psychology, 7, 341-370. 

Kosslyn, S. M. & Pomeranz, J. R. (1977). Imagery, 
propositions and the form of internal 
representations. Cognitive Psychology, 9, 52-76. 

Kubie, L. S. (1958). Neurotic Distortions of the Creative 
Process. Lawrence: Kansas University Press. 

Manktelow, K. I. & Evans, J. St B. T. (1979). 
Facilitation of reasoning by realism: Effect or 
non-effect? British Journal of Psychology, 70, 477-488. 

Neisser, U. (1963). The multiplicity of thought. British 
Journal of Psychology, 54, 1-14. 

Nisbett, R. E. & Wilson, T. D. (1977). Telling more 
than one can know: Verbal reports on mental 
processes. Psychological Review, 84, 231-259. 

Osherson, D. (1975). Logic and models of logical 
thinking. In R. J. Falmagne (ed.), Reasoning: 
Representation and Process. New York: Wiley. 

Paivio, A. (1971). Imagery and Verbal Processes. New 
York: Holt, Rinehart & Winston. 

Pylyshyn, Z. W. (1973). What the mind's eye tells the 
mind's brain: A critique of mental imagery. 
Psychological Review, 80, 1-24. 

Quinton, G. & Fellows, A. J. (1967). ‘Perceptual’ 
strategies in the solving of three term series 
problems. British Journal of Psychology, 66, 69-78. 

Revlis, R. (1975). Syllogistic reasoning: Logical 
decisions from a complex data base. In R. J. 
Falmagne (ed.), Reasoning: Representation and 
Process. New York: Wiley. 

Revlin, R. & Mayer, R. E. (1978). Human Reasoning. 
New York: Wiley. 

Roberge, J. J. (1971). Some effects of negation on 
adults’ conditional reasoning abilities. Psychological 
Reports, 29, 838-844. 

Roberge, J. J. (1976). Reasoning with exclusive 
disjunctive arguments. Quarterly Journal of 
Experimental Psychology, 28, 419-427. 

Ryle, G. (1949). The Concept of Mind. London: 
Hutchinson. 

Sells, S. B. (1936). The atmosphere effect: An 
experimental study of reasoning. Archives of 
Psychology, no. 200. 


SHaver, P., PERSON, L. & Lana, S. (1975) Converging 
evidence for the functional significance of imagery in 
problem solving. Cognition, 3, 359-375. 


. SMEDSLUND, J. (1970). On the circular relation between 


logic and understanding. Scandinavian Journal of 
Psychology, 11, 217-219. 

SmaTu, E. R. & Mitzer, F. D. (1978). Limits on 
perception of cognitive processes: A reply to Nisbett 
& Wilson. Psychological Review, 85, 355-362 

STERNBERG, R. J. (1977). Intelligence, Information 
Processing and Analogical Reasoning. New York: 
Wiley 

TAPLIN, J. E. & STAUDENMAYER, A. (1973). 
Interpretation of abstract conditional sentences in 
deductive reasoning. Journal of Verbal Learning and 
Verbal Behavior, 12, 530—542. 

VAN Duyne, P. C. (1973). A short note of Evans 
criticism of reasoning experiments and his matching 
bias hypothesis. Cognition, 2, 239-242. 

WASON, P. C. (1966). Reasoning. In B. M. Foss (ed.), 
New Horizons in Psychology, 1. Harmondsworth, 

' Middx: Penguin. 

Wason, P. C. (1968). ‘On the failure to eliminate 
hypotheses...’ a second look. In P. C Wason & 

P. N. Johnson-Laird (eds), Thinking and Reasoning. 
Harmondsworth, Middx Penguin. 

WASON, P. C. (1969). Regression in reasoning? British 
Journal of Psychology, 60, 471—480. 

WASON, P. C. & Evans, J. ST B. T. (1975). Dual 
processes in reasoning? Cognition, 3, 141-154. 


WASON, P. C. Jonnson-Lamp, P. N. (1970). A conflict 


between selecting and evaluating information in a 
reasoning task. British Journal of Psychology, 6, 
509—515. 

Wason, P. C. & JoHNSON-LamD, P. N. (1972). 
Psychology of Reasoning: Structure and Content: 
London: Batsford. 

Wason, P. C. & SHapmo, D. (1971). Natural and 
contrived experience in a reasoning problem. 
Quarterly Journal of Experimental Psychology, 23, 
63-71. 

WITTGENSTEIN, L. (1968). Philosophical Investigations. 
Oxford: Blackwell. l 

WOODWORTH, R. S. & SgL1s, S. B. (1935). An 
atmosphere effect in syllogistic reasoning. Journal of 
Experimental Psychology, 18, 451-460 


Received 25 November 1978; revised version received 9 March 1979 


This paper is a (slightly modified) English language version of the paper, 'Aspects actuels de la psychologie du
raisonnement', to appear in Bulletin de Psychologie, 1979.
Requests for reprints should be addressed to Dr J. St B. T. Evans, Department of Psychology, Plymouth
Polytechnic, Drake Circus, Plymouth PL4 8AA.


British Journal of Psychology (1980), 71, 241-246. Printed in Great Britain


Speech modification in near-mute schizophrenics 


Barry B. Hart 





Twelve male near-mute schizophrenics were divided into a model-plus-reinforcement group; a 
model-only group; and a control group, to test the hypothesis that the presence of a verbalizing 
model to 35 mm slides would serve as an eliciting stimulus to increase speech output. 

Significant increases in verbalizations were found over treatment sessions, the model-plus- 
reinforcement condition showing the most marked improvement. Generalization to the ward was 
found for only three subjects. 





The importance of models in transmitting and controlling speech in the treatment of 
near-mute psychotic patients has interested researchers since 1963. In their pilot work, 
Bandura & Walters (1963) found that reinforcement alone was unlikely to change speech 
output unless a model was provided. In one of the first experimental accounts of using
imitation to reinstate verbal output in this type of subject, however, Sherman (1965)
demonstrated the functional role of contingent reinforcement in maintaining speech. A 
small generalization effect was also found. 

In view of this work, Wilson & Walters (1966) investigated the effectiveness of two
treatments: repeated exposure to a model verbalizing in response to colour transparencies, followed by
reward for speech; and repeated exposure to the model without reward for imitation. A
control group was shown the slides but had neither a model nor reinforcement for speaking.
Although there was no significant sessions × groups effect when the three different
treatments were implemented, a significant linear trend was found for the 
model-plus-reinforcement group. Attendant ratings of the patients, however, indicated that 
in all groups patients spoke even less in the ward after all the treatment sessions than 
beforehand. The researchers advised that future studies should make use of a setting more 
closely approximating the patient's environment in the hospital. 

By making a patient's food contingent on speaking, Neale & Liebert (1969) increased
verbalizations in their one subject to the extent that she was eventually deemed well
enough to be transferred from a closed to an open ward. Throughout the experiment, the 
role of therapist was passed through a succession of ward aides, as well as to a non-mute 
patient. Only partial generalization was found, however, when the subject was transferred 
to her new ward, as the contingencies were no longer programmed. 

Much research on social imitation in adult psychotics is felt to have been limited to 
paradigms where response contingent reinforcers are administered to the subject. Chartier 
& Ainley (1974) felt this technique was important for behavioural change of existing 
repertoires but not for the acquisition of new responses via observational learning. They felt 
that adult psychotics could acquire new behaviour through observation of a model without 
contingent rewards being offered, and that a previously positive relationship with a model 
would facilitate such learning. Their results indicated that subjects who had interacted with 
a non-contingently warm and reinforcing model showed equivalent task-relevant imitative 
learning to subjects exposed to a neutral model or no model. They did, however, show 
higher levels of incidental learning. It was suggested that incidental imitation alone may 
account for a patient's acquisition of maladaptive behaviour patterns during hospitalization 
via modelling patient-patient and patient-staff interactions. 

In contrast, while Bandura (1965) acknowledges that an individual may covertly acquire 
modelled events, he would argue that observational learning will probably not be
translated into action until positive incentives are introduced. In testing this, Robins &
Wexley (1975) compared the effects of modelling-plus-reinforcement (MR); reinforcement 
alone (R); and demand characteristics (D) on individual verbal output in leaderless group 
discussions. Subjects who spoke the least in an initial discussion were chosen as target 
persons (TP) and received one of the three treatment conditions, depending on their group. 
Target persons in all groups showed increases in verbal output, but MR was superior to
both other conditions. This was consistent with Bandura's mediational theory of modelling 
(1969), which states that when modelling is used, the subject engages in multiple 
observational trials using representational mediators instead of overtly responding. After 
the modelling stimuli have been coded into images or words in memory, they function as 
mediators for subsequent reproduction. The MR group probably had these mediators 
available to them, while the R group did not. The modelling therefore enhanced the effect 
of the reinforcement. 

In assessing the specific utility of modelling in changing the social behaviour of 
psychiatric patients, Jaffe & Carlson (1976) found that their results failed to indicate a 
consistent superiority of modelling over verbal instructions. Both methods, however, 
produced either the same or better results than an attention group, where subjects 
interacted with the experimenter without the benefit of observational cues. It was felt that 
this attention group, which failed to produce a change in patients, was analogous to the 
social stimulation that usually makes up a staff-patient interaction on long-stay wards. The 
results indicated that patients respond more appropriately if the nature of the desired 
behaviour is made explicit. 

The present study was concerned with the schizophrenic patient who had
learned through institutionalization that it was often not necessary to speak in
hospital in order to survive its regime. It was felt that mute gestures were inadvertently 
reinforced on wards in so far as patients could get attention (e.g. getting an aide to light a 
cigarette) simply by non-verbal means. Chartier & Ainley (1974) would argue that this 
interaction would serve as maladaptive behaviour that other patients could model. 
Likewise, it was hypothesized that by using the same technique that gave rise to the 
‘mutism’ (i.e. modelling), coupled with reinforcement for talking, it would be possible to 
increase verbal output in this type of subject. 

The design of Wilson & Walters' (1966) study was incorporated in the present study,
except that sessions were carried out in a therapy room on the ward with which all subjects
were familiar, in order to aid any generalization effect. In addition, an increased percentage
of slides taken of patients in the hospital were used, in the expectation that subjects would 
be more likely to respond to a scene with which they were familiar. Lastly, an additional 
extinction session was included to study the effect of prompting. 


Method 
Subjects 


Twelve male schizophrenics, who had spent at least a year on a token economy ward for severely 
regressed patients, served as subjects. They ranged in age from 47 to 68 years, with a mean age of 
59.6 years at the commencement of the study. The men were chosen from a ward of 33 male patients,
on the basis of the ward supervisor's rating of their past speech output as being minimal. 

Four patients were assigned to each of three groups: a model-plus-reinforcement (MR) group, a 
model-only (M) group, and a control group (C). All subjects were re-rated on the ward following 
completion of the study, to test for possible generalization effects. 

As all patients lived on a token economy ward, the penny tokens provided a relevant source of 
secondary reinforcement. 
Apparatus


The 6 x 4 m behaviour treatment room contained a Kodak Carousel automatic slide projector Model 
550, by which means each slide was projected on a front projection screen approximately 4 m away. 

The stimuli consisted of 12 sets of 20 and 5 sets of 10 Ektachrome 35 mm colour transparencies 
(total 290). The sets of 10 were used for the first session, as well as the ‘extinction’ trials (i.e. sessions
8, 15, 16 and 17). During the sessions when a model was used, the sets of 20 slides were used. Each 
subject would therefore be presented with 10 new slides each day. 

There were four content categories within each set of slides, the order of presentation arranged as
follows: (1) scenery slide; (2) ward slide; (3) animal slide; (4) ward slide; (5) people outside ward
slide; (6) ward slide; (7) scenery slide; (8) ward slide; (9) people outside ward slide; and (10) animal
slide. There were therefore two slides in each of three categories and four ‘ward’ slides in every group
of 10 slides that was presented. This order was maintained for all sets of slides.
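
The composition of the stimulus material can be checked with a few lines of arithmetic. The following Python sketch uses only the figures given in this section; the variable names are illustrative and form no part of the original study.

```python
from collections import Counter

# Figures taken from the Apparatus description.
SETS_OF_20 = 12
SETS_OF_10 = 5

# Total number of transparencies across all sets.
total_slides = SETS_OF_20 * 20 + SETS_OF_10 * 10

# Fixed order of content categories within every group of 10 slides.
SLIDE_ORDER = [
    "scenery", "ward", "animal", "ward", "people outside ward",
    "ward", "scenery", "ward", "people outside ward", "animal",
]

# Number of slides per category in one group of 10.
category_counts = Counter(SLIDE_ORDER)

print(total_slides)           # 290
print(dict(category_counts))
```

The count reproduces the stated total of 290 slides and the stated split of four ‘ward’ slides and two slides in each of the other three categories.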

A Sony reel-to-reel tape recorder was used to record each subject's speech output, so that no 
verbalization went unnoticed.


Procedure 


All subjects were run over the same span of 17 days for repeated sessions of training. In the MR and 
M groups, two sets of 10 slides were used in order that each subject would be presented with slides to 
talk about which were different from those described by the model. The subject was to imitate the 
model's behaviour of talkativeness, not his actual words. 

As C group received no modelling until session 9, only one set of slides was needed for them each 
session. It must be emphasized that subjects in the MR and M groups received no more exposure to
the slides than did control subjects, as the slides described by the model were always out of the 
subject's view. Although intervals between sessions varied slightly because of hospital routine, no 
subject was given more than one training session on a single day. 

Subjects were assigned to three groups matched for verbal performance. They received different
treatments during sessions 2-7, and all subjects received model-plus-reinforcement during sessions
9-14.

Subjects were brought from the ward to the testing room individually. The procedure varied from 
session to session as described by Wilson & Walters (1966, p. 63).*


Session 1 This session was employed to establish a base-line level of responding to the stimulus 
material. The subject was seated in front of the screen, while the experimenter seated himself to one 
side of the screen facing the subject. The experimenter instructed the subject that he would see on the 
screen some pictures and requested him to say what was happening in each one. 

During the presentation of the slides, the experimenter remained silent except for providing a
standard series of prompts, such as ‘What do you see in this picture?’ or ‘Tell me about this one’.
Prompts were given after a 10 s delay following the onset of the presentation of a slide if the subject 
had up to that point remained silent. Prompts were never given more than twice per slide. Slides for 
all sessions were presented for 30 s. 


Sessions 2-7 During these sessions, the MR group was exposed to a verbalizing model and was also
reinforced for imitating the model's behaviour of talkativeness. That is, in order to receive
reinforcements, the subject had only to respond to the slides by talking about them. The model
commenced the sessions by taking the seat in front of the screen and speaking rapidly and
continuously in response to each slide as one of the sets was projected on the screen. The subject,
whose view of the slides was obstructed by a baffle, meanwhile listened to the model talk. The model
and the subject then changed places; the model gave instructions similar to those issued in session 1,
and the second set of slides was presented to the subject. Prompts were given in the same manner as
in session 1. 

During the presentation of the slides, subjects were reinforced with penny tokens for the production 
of words. The fixed ratio schedule of reinforcement utilized for any subject during a particular session 
was based on the mean number of words per slide spoken by that subject during the previous session. 
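
A fixed ratio schedule of this kind can be sketched as follows. The text states only that the ratio was ‘based on’ the previous session's mean; the rounding rule, the floor of one word per token, the function names and the word counts below are all assumptions made for illustration.

```python
from typing import Sequence


def fixed_ratio(previous_words_per_slide: Sequence[int]) -> int:
    """Words required per token for the coming session.

    ASSUMPTION: the ratio is the previous session's mean words per slide,
    rounded to the nearest whole word, with a floor of 1. The source says
    only that the schedule was 'based on' that mean.
    """
    mean_words = sum(previous_words_per_slide) / len(previous_words_per_slide)
    return max(1, round(mean_words))


def tokens_earned(words_spoken: int, ratio: int) -> int:
    """Tokens delivered for a run of words under a fixed-ratio schedule."""
    return words_spoken // ratio


# Hypothetical word counts for the 10 slides of the previous session.
previous_session = [3, 5, 4, 4, 2, 6, 4, 4, 3, 5]
ratio = fixed_ratio(previous_session)   # mean of 4.0 words gives a ratio of 4
tokens = tokens_earned(10, ratio)       # 10 words at one token per 4 words
```

Under these assumptions a subject who averaged four words per slide in the previous session would earn one token for every four words in the next, so the requirement rises as his output rises.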


* Procedure used as per Wilson & Walters (1966, p. 63) with author's permission, reproduced with kind
permission of Pergamon Press.
A second group of four subjects (M) was exposed to the model but received no reinforcements
during the presentation of the second set of slides. For the control group (C), sessions 2-7 were 
conducted in the same manner as session 1. Both the M and C groups received an end pay-off for 
attending the sessions, the amount being based on the mean amount received by the MR group for 
the equivalent training session. Thus, all groups received the same net amount of ‘take-home’ pay, 
the difference between the MR subjects and the other two groups being the contingencies of the
reinforcement and the time at which the subjects were paid. That is, reinforcements for MR were contingent on
the subject's producing a number of words, whereas tokens were dispensed at the end of the session 
whatever the nature of the subject's verbal performance in the latter two groups. These subjects were 
told that they were given the tokens for having attended the session. 


Session 8 For all subjects, session 8 was conducted in the same manner as session 1, in order to
determine if any increases in verbal output resulting from the social learning procedures would be 
maintained when the procedures were discontinued. An end pay-off of 10 tokens was given to all 
subjects after session 8 to maintain motivation for attending the sessions. 


Sessions 9-14 All subjects were now given, for six sessions, the model-plus-reinforcement treatment. 


Session 15 The modelling-plus-reinforcement procedure was discontinued; the procedure used in 
session 8 being reinstated for all groups. All subjects were given an end pay-off of 15 tokens for 
having attended. 


Session 16 Subjects received the same treatment as in session 15, except that verbal prompts were 
discontinued. Again, 15 tokens were given to all subjects as an end pay-off. 


Session 17 Subjects received the same treatment as in session 15; i.e. prompts were reinstated. This 
was done to test the effectiveness of the experimenter's attention to the subject via prompts. There 
was a final pay-off of 15 tokens after this session. 
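
The 17-session design described above can be summarized as a simple lookup. The sketch below is a reading of the Procedure section; the function name and the condition labels are illustrative and are not the author's terms.

```python
def procedure(session: int, group: str) -> str:
    """Return the procedure applied in a session (1-17) to a group.

    Groups: 'MR' (model-plus-reinforcement), 'M' (model only), 'C' (control).
    Labels are shorthand introduced for this sketch.
    """
    if group not in ("MR", "M", "C"):
        raise ValueError("group must be 'MR', 'M' or 'C'")
    if not 1 <= session <= 17:
        raise ValueError("session must be between 1 and 17")

    if session == 1:
        return "baseline"
    if 2 <= session <= 7:
        # The only block in which the three groups are treated differently.
        return {
            "MR": "model-plus-reinforcement",
            "M": "model only",
            "C": "baseline",
        }[group]
    if 9 <= session <= 14:
        return "model-plus-reinforcement"
    if session == 16:
        return "extinction without prompts"
    # Sessions 8, 15 and 17: session 1 procedure, prompts included.
    return "extinction with prompts"
```

For example, `procedure(3, "C")` returns `"baseline"`, whereas `procedure(10, "C")` returns `"model-plus-reinforcement"`, reflecting the point at which the control group first received treatment.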


Results 


Figure 1 shows the mean number of words per slide for sessions 1-17 as a function of
treatment procedures and training sessions. The graph indicates that during sessions 2-7,
all groups increased speech output. Three correlated t tests indicate these increases to be
significant, the MR group showing the most improvement (t = 17.62, d.f. = 3, P < 0.01);
the M group next in improvement (t = 4.94, d.f. = 3, P < 0.01); and the control group
showing the least in comparison (t = 4.13, d.f. = 3, P < 0.05), but still a significant
improvement.
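
For readers unfamiliar with the correlated (paired) t test, the following Python sketch shows how a statistic with d.f. = 3 arises from four subjects. The scores are hypothetical, and which two sets of scores were paired in the study is not stated here, so the before/after framing is an assumption.

```python
import math


def paired_t(before, after):
    """Correlated (paired) t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased variance of the difference scores.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Hypothetical mean words per slide for four subjects (not the study's data).
baseline = [1, 2, 1, 2]
treatment = [5, 7, 6, 8]

t_value, df = paired_t(baseline, treatment)
print(df)  # 3
```

With four subjects per group the test has n - 1 = 3 degrees of freedom, as reported for each of the three groups.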

To determine whether the three treatments performed over the groups from sessions 2-7
had differential influence, each subject's data for the six sessions was pooled to give four
independent observations per group. The one-way analysis of variance performed on this
data indicated that these differences were significant (F = 9.62, d.f. = 2,9, P < 0.01). The
Scheffé Multiple Comparison Test showed these differences to lie between the MR and C
groups (P < 0.05); the MR versus M and C groups combined (P < 0.05); and the C versus
the MR and M groups combined (P < 0.01).
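
The degrees of freedom of this analysis follow from three groups of four pooled observations. A minimal sketch of a one-way analysis of variance is given below; the scores are hypothetical and serve only to show the computation, not to reproduce the reported F.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the scores about their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical pooled scores, one value per subject (not the study's data).
mr_group = [10, 12, 11, 13]
m_group = [6, 7, 5, 6]
c_group = [3, 4, 2, 3]

f_value, df_b, df_w = one_way_anova([mr_group, m_group, c_group])
print(df_b, df_w)  # 2 9
```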

When base-line procedures were reinstated in session 8, both experimental groups 
showed a drop in verbal output, the drop being more marked in the M group. The control 
group, however, markedly increased the level of responding during this session, a 
paradoxical finding, since session 8 was no different from sessions 1-7 for this group.

When treatment was reinstated in session 9, the efficacy of the modelling and reinforcement
was apparent. A comparison between the data from sessions 9-14 indicates that subjects in
what had previously been the control group made the most marked progress during this
part of training (t = 12.83, d.f. = 3, P < 0.01). A one-way analysis of variance indicated
significant differences between the groups over these trials (F = 4.27, d.f. = 2,9, P < 0.05).
Figure 1. Mean verbal productivity of three groups of schizophrenics over 17 treatment sessions.
Sessions 8, 15, 16 and 17 were ‘extinction’ sessions (see text). During sessions 9-14 all subjects
received modelling-plus-reinforcement. Separate curves are shown for the model-plus-reinforcement,
model-only and control groups.


The Scheffé test showed this difference to lie solely between the C versus the MR and M
groups combined (P < 0.05).

During session 15, only three subjects increased verbalization from the previous session; 
all other subjects dropped markedly. 

When prompts were taken away in session 16, an uncorrelated t test indicated a
significant drop in verbalizations (t = 7.55, d.f. = 22, P < 0.01). When prompts were
reinstated in session 17, an uncorrelated t test indicated significant improvement in verbal
behaviour (t = 4.88, d.f. = 22, P < 0.01).
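
The reported d.f. = 22 corresponds to an uncorrelated t test on two sets of 12 scores (12 + 12 - 2). A pooled-variance version is sketched below under that assumption; the scores are hypothetical and the function name is illustrative.

```python
import math


def pooled_t(sample_a, sample_b):
    """Uncorrelated (independent-samples) t with a pooled variance estimate."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)

    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical words per slide for 12 subjects in two sessions (not the study's data).
with_prompts = [12] * 6 + [14] * 6
without_prompts = [4] * 6 + [6] * 6

t_prompt, df_prompt = pooled_t(with_prompts, without_prompts)
print(df_prompt)  # 22
```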

The analysis of variance performed on the data between individual groups of slides over 
all subjects, to see if the improvement found in verbal output could be attributed at all to 
slide content, indicated non-significance (F = 0.76, d.f. = 3,9, P > 0.05).


Discussion 


The statistical data provide evidence that the social learning procedures utilized in this 
study are capable of increasing the verbal behaviour of regressed schizophrenics. All groups 
significantly increased speech output over all treatment conditions. From previous research 
(Wilson & Walters, 1966; Robins & Wexley, 1975), it was expected that MR subjects 
would show the greatest performance increases. In fact, the Scheffé test did indicate that 
although modelling was more influential than no treatment, modelling and reinforcement 
together were the most powerful determinants of change. As the control group did improve 
over sessions 2-7, however, the importance of non-specific ‘relationship’ factors (e.g. 
selecting the patients; the attention given to them throughout the sessions; and the end
pay-offs) must not be overlooked.


It should be noted that the control group’s increase on session 8 (extinction) can be
attributed to one subject’s heightened output, rather than that of the whole group. 

It is also noteworthy that when all subjects were given the modelling-plus-reinforcement
treatment over sessions 9-14, all groups showed a marked increase in speech output. This
was not, however, to the extent of the MR group's increase over sessions 2-7. This suggests
that the combination of modelling and reinforcement, from the outset, is the most 
successful way to effect desired change in verbal output. 

Although speech dropped in session 15, it can still be argued that the data suggest the 
presence of a frustrative non-reward effect (also found by Wilson & Walters, 1966), as the 
mean number of words per slide still averaged from 11 to 14. 

When prompts were discontinued in session 16, all groups showed a significant drop in 
performance. To test whether the prompts had taken on the characteristics of a 
discriminative operant, session 17 was added, where prompts were reinstated. A significant 
increase in verbalizations to the 0.01 level was found.

By the end of the study, subjects were responding in the experimental situation on an 
average of 8-11 words per slide, without modelling or reinforcement, but with prompting
only.


Examination of pre- and post-treatment ward ratings, however, indicated a 
generalization effect for only three subjects. This either indicates that the present findings 
are clinically irrelevant, or, that ward staff need to be briefed more fully either to maintain 
or change the programmed contingency pattern. 
Effects of subvocal suppression, articulating aloud and noise on sequence
recall


John Wilding and Naresh Mohindra 





Subjects were required to reproduce in order a sequence of five letters; the set of letters was known so 
only memory for sequence was tested. Experiment 1 showed that suppressing subvocal rehearsal by 
saying ‘the’ continuously during list presentation and until recall depressed performance to the same 
level on acoustically confusable and non-confusable lists. Listening to 85 dBC white noise during list 
presentation improved performance on acoustically confusable lists in non-suppression conditions and 
had no effect in suppression conditions. The result refutes the hypothesis that noise suppresses inner 
speech. Expts 2, 3 and 4 showed that articulating the items aloud during list presentation and until 
recall improved performance when lists were presented at ½ s per item and depressed it when they
were presented at 2 s per item. Improvement occurred under 85 dBC white noise in Expts 2 and 4, 
but the improvement was only significant in non-articulation conditions. It is suggested that noise 
increases subvocal articulation and that both noise and articulation increase maintenance rehearsal at 
the expense of elaboration rehearsal. 








Evidence has been accumulating recently that high intensity white noise improves ordered 
recall (Hockey & Hamilton, 1970; Hamilton et al., 1972; Daee & Wilding, 1977; Hamilton 
et al., 1977; Millar, 1979), though several experiments requiring ordered recall have found 
no effect of noise (Miller, 1957; Murray, 1965a; Sloboda & Smith, 1968; Haveman & 
Farley, 1969; Davies & Jones, 1975) or an impairment (Wilkinson, 1975; Salame & 
Wittersheim, 1978) so that it is still unclear what features of the task are important in 
determining the direction of noise effects. 

Noise has also been found to reduce semantic organization in free recall (Hörmann &
Osterkamp, 1966; Daee & Wilding, 1977; Smith, 1978). In the only experiments using more 
than two noise levels Daee & Wilding (1977) found that the effects were non-monotonic, 
maximal sequence recall and minimal semantic organization occurring at a noise level of 
75 dBC. 

Other arousers and also de-arousers have been shown to reduce semantic organization in 
free recall, such as anxiety (Mueller, 1976), and alcohol (Dornic, 1974; Parker et al., 1974; 
Rosen & Lee, 1976; Birnbaum et al., 1978); Ghoneim & Mewaldt (1977) found that after 
taking scopolamine and diazepam, subjects performed worse on categorized lists in delayed 
free recall but not on lists of random words. 

Folkard (1979) found greater use of semantic cues and less use of acoustic cues in 
ordered recall in a group tested in the afternoon (high arousal) than one tested in the 
morning. Schwartz (1975) has argued from results derived from a different task that 
subjects high in introversion and neuroticism make less use of semantic information. 
Dornic (1974) argued that task difficulty is the critical factor and found that addition of a 
secondary task, higher information load or increased age reduced category clustering in 
recall. 

Few direct measures have been made of the effect of arousers other than noise on recall 
of order. Davies & Jones (1975) and Fowler & Wilding (1979) found that incentives aided 
recall in order and recall of spatial position; as indicated above noise aids recall of order, 
but it impairs recall of spatial position (Hockey & Hamilton, 1970; Davies & Jones, 1975; 
Fowler & Wilding, 1979) so incentives and noise appear to act differently. Dornic (1974) 
found that addition of a secondary task, higher information load and alcohol all reduced 
recall of item information but not of order information and also showed that preventing
subjects from verbalizing items internally affected recall of order information but not of 
item information. Data from tasks requiring ordered recall, such as digit span and serial 
recall, are confusing and do not support the view that arousers in general improve 
short-term recall of order (but as pointed out above the effect of noise on such tasks is 
uncertain). Blake (1967) and Baddeley et al. (1970) found worse serial recall later in the day 
when arousal is higher, but Jones et al. (1979) found no significant effects with auditory 
presentation and afternoon superiority with visual presentation and Folkard (1976) found 
running memory span superior later in the day. Jones et al. (1979) found a negative 
relation between arousal measured by the EEG and digit span. Parker et al. (1974) and 
Rosen & Lee (1976) found alcohol reduced digit span. Weingartner & Faillace (1971) found 
an adverse effect of alcohol on serial learning, but Korman et al. (1960) found alcohol 
counteracted the adverse effect induced by the expectation of an electric shock in a serial 
learning task. Folkard (1976) found higher muscle tension reduced serial recall. Stress and 
high state anxiety in several studies have been found to impair digit span, but trait anxiety 
usually has no effect (Eysenck, 1977, p. 229). Nicotine has been found to improve ordered 
recall (Andersson, 1975). 

Faced with such conflicts in the evidence, it is best to treat noise as a distinct type of 
stimulus and to attempt no predictions based on results obtained with other types of 
arousing stimulation. Several relatively specific suggestions have been advanced to try to 
explain the effects of noise. General explanations in terms of arousal are not helpful, even if 
the data were consistent, since they do not specify the precise mechanisms involved in the 
obtained results. Hamilton et al. (1972) suggested noise increased attentional capacity 
which was then used to process additional retrieval cues, but Davies & Jones (1975) and 
Fowler & Wilding (1979) have criticized this interpretation. Fowler & Wilding suggested 
that existing evidence supports the view that it is order cues specifically rather than any 
available additional cues that are processed in noise. Schwartz (1975) concludes from an 
experiment looking at extraversion, neuroticism and recall strategy, that under high arousal 
information is processed physically rather than semantically. The exact relation between his 
findings and conclusion and the order effects found with noise is, as yet, unclear. Dornic
(1974) has suggested that increased task difficulty induces use of a ‘lower order’ memory 
process storing information literally without semantic processing. Daee & Wilding (1977) 
suggested that noise prolongs stimulus traces, facilitating the development of sequential 
associations between items. Hamilton et al. (1977) draw a directly opposed conclusion that 
arousal, including noise, speeds up the rate at which information is processed through the 
system. This theory is discussed in more detail later. Folkard (1976) argues that in high 
arousal the articulatory loop in Baddeley & Hitch’s (1974) model of working memory is 
impaired. Poulton (1977) makes a similar suggestion that noise suppresses ‘inner speech’. 
However, Poulton also argues, when explaining results where noise improves performance,
that subjects can subarticulate ‘more loudly’ to compensate for noise and this can explain
neglect of incidental stimuli (narrowing of attention, as found by Hockey & Hamilton, 
1970; Davies & Jones, 1975; Fowler & Wilding, 1979) and better recall of the later items in 
a list at the expense of the earlier ones (Hamilton et al., 1977; Millar, 1979). To complicate 
matters further Poulton also allows that arousal changes in noise may compensate for or 
exaggerate other effects of the noise, depending on the exact combinations involved. 

Eysenck (1977, p. 178) distinguished two main opposed views on the effects of arousal, 
that of Schwartz, to whom we may add Dornic, that there is impairment of the central 
executive in the Baddeley & Hitch (1974) system, and that of Folkard, to whom we may 
add Poulton (with the qualifications above) that there is impairment of the articulatory 
loop in that system. Neither of these views seems capable of explaining all the results 
obtained with noise. Schwartz cannot explain the apparent restriction of improvement in 
noise to order information, and Folkard fails to explain both the improved recall of order 
information and impaired recall of semantic information. His arguments in support of his 
view are criticized in detail below. 

Daee & Wilding (1977) did not present their suggestion of increased trace duration in 
terms of the Baddeley & Hitch model; their suggestion implies a less efficient executive 
working memory due to problems in clearing unwanted information, and a more efficient 
articulatory loop, due to the longer survival of items before recirculation is required. They 
also point out that their results are explicable if subjects try to rehearse more in noise, 
producing improved order recall until the noise levels become so high as to disrupt this 
process. This is similar to Poulton's second suggestion; in fact the ‘extended trace’ theory 
and Poulton's louder subarticulation are not necessarily incompatible since ‘louder’ 
internal speech may produce more durable traces. Daee and Wilding's theory, however, 
would apply to all types of memory trace, while Poulton's theory applies specifically to 
articulatory-acoustic traces. 

The experiments to be described here had the aim of determining whether the effects of 
noise on recall of order are explicable in terms of either suppression or increased use of an 
articulatory loop. Levy (1971), Baddeley & Hitch (1974), Baddeley et al. (1975), Healy 
(1975) and Millar (1979) have used concurrent verbal activity to suppress functioning in the 
articulatory loop. Suppression of this kind depresses recall of all but the final few items in 
free recall (Richardson & Baddeley, 1975), impairs recall of order (Murray, 1967; Healy, 
1975; Millar, 1979), eliminates the effect of word length in visually presented lists (Baddeley 
et al., 1975) and removes the advantage of acoustically dissimilar lists (Murray, 1967; 
Healy, 1975). The findings are consistent with blocking of the articulatory loop, but 
Richardson and Baddeley's result implies that more complex memory systems are also 
affected. If noise also suppresses the articulatory loop, it will have similar effects to 
suppression, and combine with suppression to intensify its effects. 

Recently Millar (1979) has examined the effects of both suppression and noise on recall 
of eight consonants and found a complex picture. Total recall, regardless of position, was 
better overall in quiet (75 dBA) than in noisy (92 dBA) conditions but on day 2 of the 
experiment recall in noise improved over the session and in the second half was superior to 
that of the quiet group, whose recall declined over the session. On the first half of the list 
the quieter conditions produced superior recall, while the reverse held on the second half of 
the list (compare Hamilton et al.’s, 1977, result). Recall in the correct serial position was 
superior in noise especially in the suppression conditions. Suppression impaired total recall 
and recall in order. Millar concludes that the results are difficult to account for by the 
hypothesis that noise suppresses subvocal rehearsal, since noise improved recall in the 
correct order and subvocal suppression impaired it. 

If, on the other hand, noise induces stronger internal speech, it will have effects like those 
produced by subjects articulating aloud the items during presentation and/or retention. The 
effects of this procedure are less clear than those of suppression. Crowder (1970) found that 
active vocalization (own voice) induced worse recall than passive (experimenter's voice) for 
the earlier serial positions in ordered recall of digits presented at 2 per second but both 
were superior to reading; he suggested that active vocalization interfered with rehearsal. 
Murray (1965b, 1966, 1967, 1968) has shown consistently that articulating aloud improves 
recall (especially of later items in the list) compared with reading, and this effect increases 
with the loudness of the articulation. Tell & Ferguson (1974) found that articulating aloud 
improved recall in the Brown-Peterson paradigm. Murray (1965b, 1966) found the benefit 
of articulating was greater at faster presentation rates (his rates varied from 1 item/s to 
4 items/s). Folkard (1976, Fig. 2), however, found that articulation impaired recall of the 
earlier part of a list of nine digits, but did not report if the difference was significant. The 
items were presented at 1/s and the reason for the difference between this result and the 
other data is unclear. 

Fischler et al. (1970) found that, as the restrictions placed upon rehearsal in free recall 
became more severe (from silent rehearsal, to free vocal rehearsal to rehearsal of only the 
current item), performance deteriorated. Maisto et al. (1977) found that, in a free recall 
task, having to say each word aloud three times when it appeared impaired recall 
performance, but only when subjects expected recall, not when they expected a recognition 
test. Folkard & Monk (1979) also found that articulation impaired free recall for earlier 
serial positions in the list. In both these free recall experiments the rate of presentation was 
slow (one item/3 s and one item/2.5 s) and either the task differences or the slow 
presentation rate could have been responsible for the differences between these results and 
those of Murray. The overall picture therefore seems to be that articulation aids ordered 
recall of lists presented at a fairly fast rate. On the other hand articulation impairs free 
recall of items presented at a slow rate. Whether the task or the rate is the critical variable 
is not certain on the available evidence but Murray's results suggest slow presentation rates 
reduce the benefit of articulation in ordered recall. If noise acts by increasing the strength 
of subvocal speech, it should have similar effects and have small or no additional effects to 
those of articulating aloud when the two are combined (since it is assumed that articulating 
aloud produces a strong memory trace). 


Experiment 1 
Method 


Order recall was tested without requiring item recall using the method of Healy (1975, 1977), whereby 
a fixed set of letters is presented in varying order from trial to trial. Healy used four letters but the 
present experiments used two sets of five letters each, varying in acoustic similarity. Five letters were 
used because in most conditions ceiling effects occurred with four items. The independent variables 
were noise level (65 or 85 dBC), subjects being randomly assigned to one or other level, articulatory 
suppression (suppression or no suppression), acoustic confusability of the letter set (confusable or 
non-confusable), recall delay (short or long, with 3 or 16 intervening digits before recall), and rate of 
presentation (fast or slow, with ½ s or 2 s between the onset of each item, items always appearing for 
1s). The last variable was included both because the extended trace hypothesis of Daee & Wilding 
implies that noise effects may vary with presentation rate, perhaps interfering at fast rates and 
improving performance at slow rates, and because of the indications already described that rate may 
be an important variable, at least in experiments using articulation aloud if not in those using 
suppression. Subjects experienced all combinations of the last four variables, three trials at each 
combination. 


Subjects. Twenty subjects took part, ten in each noise condition. They were students and 
postgraduates from all departments of the college. 


Materials. The letters were presented on the VDU of a Commodore PET microcomputer, and were 
4 mm high and viewed at a distance of approximately 0.5 m. The confusable set was CDGTP and the 
non-confusable set was HIMRZ. Twenty-four sequences of each set were generated randomly, all 
different, with meaningful sequences replaced. The keys corresponding to these letters were colour 
coded red and brown respectively to indicate the two sets and aid responding. Subjects were 
prevented from seeing the keyboard and VDU simultaneously by a screen, to prevent recoding of the 
stimuli in terms of the keyboard layout (i.e. spatially) as they were presented. After the stimuli had 
been presented on each trial, the subject could lean forward and see the keyboard while entering the 
recall sequence on it. 
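
As an illustration only, the generation of the stimulus sequences described above might be sketched as follows. The function name, the seeds and the `excluded` argument (standing in for the unspecified 'meaningful' sequences that were replaced) are assumptions of this sketch, not details of the original PET program.

```python
import itertools
import random


def make_sequences(letters, n=24, excluded=frozenset(), seed=0):
    """Draw n distinct random orderings of `letters`.

    `excluded` is a hypothetical set of orderings judged meaningful;
    the source does not say which sequences were rejected.
    """
    rng = random.Random(seed)
    candidates = [
        "".join(p)
        for p in itertools.permutations(letters)
        if "".join(p) not in excluded
    ]
    # Sampling without replacement guarantees that all sequences differ.
    return rng.sample(candidates, n)


confusable_sequences = make_sequences("CDGTP", seed=1)
non_confusable_sequences = make_sequences("HIMRZ", seed=2)
```

With five letters there are 120 possible orderings, so 24 distinct sequences per set can always be drawn.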


Procedure. Subjects were instructed as follows: 
This is an experiment on the ability to remember short sequences of letters. On each trial you 
will be shown a sequence of five letters one at a time, which will be either CDGTP or HIMRZ 
in a different order each time. These two sets of letters are shown in red and brown respectively 
on the keyboard. 

Your task is to remember the order of the letters. 

After the letters have appeared a series of ‘1’s’ will appear, either three of them or 16, one at a 
time. Sometimes there will be one other digit somewhere in the sequence. At the end of the 
sequence an asterisk will appear on the screen. Please type how many digits greater than ‘1’ 
were in the sequence (one or none), and then in response to the question mark which will 
appear, type in the sequence of letters as you recall them. Carefully check that the letters on the 
screen are the ones you intended then press the ‘Return’ key. There is then a short pause before 
the next trial. 

There will be 48 trials, divided into four blocks of 12 trials each. At the beginning of each 
block the computer will instruct you either ‘articulate’ or ‘don’t articulate’ and will tell you 
whether to expect ‘Fast presentation’ or ‘Slow presentation’. ‘Articulate’ means you should 
continue saying ‘the, the, the...’ at a steady rate continuously from the time the letter sequence 
starts until the asterisk appears. ‘Don’t articulate’ means you should keep quiet. Fast 
presentation is half a second between letters, slow presentation is two seconds between letters. 

Please wear the headphones throughout the experiment. 

Before each sequence of letters starts you will hear a hissing noise through them which will 
continue until the letter sequence ends. 

The sequence was then resummarized and a block of 12 practice trials run. 

The noise came on 2 s before the onset of the first letter and continued until the onset of the 
asterisk which indicated that recall was required. The order of the four blocks, and of the 
combinations of conditions within each block, were randomized for each subject. Each block 
contained three trials for each confusability × delay combination. The intervening task before recall 
was designed to hold attention but not to prevent rehearsal completely in non-suppression 
conditions. In fact the target digit, if it occurred, always occurred in the second half of the digit series, 
though subjects were not informed of this. 
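
The block structure just described (four blocks of 12 trials, each block fixing suppression and presentation rate, with three trials per confusability × delay combination in random order) can be sketched as below. This is a minimal reconstruction under stated assumptions: the function name, the dictionary keys and the condition labels are illustrative and do not come from the original program.

```python
import itertools
import random


def build_schedule(seed=0):
    """Return a 48-trial schedule for one subject (hypothetical layout)."""
    rng = random.Random(seed)
    # One block per suppression x rate combination, in random order.
    blocks = list(itertools.product(("suppress", "no_suppress"), ("fast", "slow")))
    rng.shuffle(blocks)
    schedule = []
    for suppression, rate in blocks:
        # Three trials for each confusability x delay cell, shuffled within the block.
        cells = list(itertools.product(("confusable", "non_confusable"), (3, 16))) * 3
        rng.shuffle(cells)
        for confusability, n_digits in cells:
            schedule.append(
                {
                    "suppression": suppression,
                    "rate": rate,
                    "confusability": confusability,
                    "n_digits": n_digits,  # intervening digits before recall
                }
            )
    return schedule


schedule = build_schedule(seed=3)
```

Noise level is not part of the schedule because it was varied between subjects.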


Results 


The results consisted of probability of a correct response for each of the five positions in 
the 16 conditions given to each subject (suppression × response delay × rate of 
presentation × acoustic similarity). They were subjected to a split-plot analysis of variance; 
all effects were tested against their interaction with subjects, or, in the case of effects 
involving noise, against the corresponding 'Subjects within noise' term. 
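
The dependent measure, probability of a correct response at each serial position, can be illustrated with a short sketch. It assumes that a letter is scored correct only when it is typed in the position in which it was presented; the function names are hypothetical and the analysis of variance itself is not reproduced here.

```python
def position_correct(presented, recalled):
    """Score each serial position 1 if the recalled letter matches, else 0."""
    return [
        int(i < len(recalled) and recalled[i] == letter)
        for i, letter in enumerate(presented)
    ]


def mean_by_position(trials):
    """Average the position scores over (presented, recalled) pairs."""
    scores = [position_correct(p, r) for p, r in trials]
    return [sum(column) / len(scores) for column in zip(*scores)]


# Example: two trials from one condition; the second transposes positions 1 and 2.
example = mean_by_position([("CDGTP", "CDGTP"), ("CDGTP", "DCGTP")])
```

Applied to the three trials in each of the 16 within-subject conditions, this gives the five proportions per condition that enter the analysis.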

The effects of suppression (F = 85.5, d.f. 1, 18, P < 0.001), rate of presentation 
(F = 37.54, d.f. 1, 18, P < 0.001), acoustic confusability (F = 7.09, d.f. 1, 18, P < 0.025) 
and serial position (F = 67.95, d.f. 4, 72, P < 0.001) were all significant, performance being 
worse with suppression, at the fast rate of presentation, with high acoustic confusability 
and in the middle positions in the list. However there were a large number of interactions, 
most of which were subsumed under the five-way interaction of suppression × rate of 
presentation × recall delay × acoustic confusability × serial position (F = 3.34, d.f. 4, 72, 
P < 0.025) which is shown in Fig. 1. Suppression reduced performance on confusable and 
non-confusable lists to the same level by depressing the latter more and this effect was 
greater at serial positions other than position 1, at the fast rate of presentation and with the 
long recall delay. 

The remaining significant effects involved noise and were as follows. Noise at 85 dBC 
compared with 65 dBC improved performance on confusable lists for all serial positions 
except the first in the non-suppression conditions, but had no effect in the suppression 
conditions (Fig. 2). As a result the noise × acoustic confusability × serial position interaction 
was significant (F = 2.78, d.f. 4, 72, P < 0.05) and so was the noise × acoustic 
confusability × suppression interaction (F = 6.07, d.f. 1, 18, P < 0.025), but not the 
four-way interaction of all these variables (F = 1.56). 

Figure 1. Mean probability of correct response at each serial position for acoustically confusable (A) 
and acoustically non-confusable (6) letters in the different conditions in Expt 1. 


Discussion 
Noise and suppression had different effects. Suppression impaired performance on all lists, 
the non-confusable ones more than the confusable, presumably by suppressing an 
articulatory loop or other system in which confusions occur. These effects were greater 
when there was less time to process the input and when there was a longer delay (filled 
with suppressing activity) before recall. Item 1 in the list was relatively immune to these 
effects. This is all compatible with what would be expected from impairment of an 
acoustically based or articulatory recirculation process. The absence of a noise effect in the 
suppression conditions is also consistent with this view since the articulatory loop (AL) is assumed to be 
inactive in these conditions. It is unclear in this case why Millar got significant noise effects 
in suppression conditions. 

Noise had the opposite effect to suppression, improving performance on the acoustically 
similar stimuli. Millar (1979) found that noise improved recall in order of the last four or 
five items in an eight-item list (a result also found by Hamilton et al., 1977) and produced 
fewer acoustic confusions than a quiet condition, though the latter difference was not 
significant. Both these findings agree with the present result. The results suggest that noise 
does not just increase the use of the articulatory loop relative to the other possible memory 
or coding systems, since this would increase the number of acoustic confusions, but that it 
improves the quality of information in the loop. One way this could happen is by longer 
trace survival, but noise did not interact significantly with either rate of presentation or 
response delay, though the extended trace theory suggests that noise should be more 
beneficial at slower rates and longer delays. 

Figure 2. Mean probability of a correct response for acoustically confusable (A) and acoustically 
non-confusable (@) letters in the non-suppression and suppression conditions in Expt 1 at each noise 
level. 

The view that noise acts by increasing the strength of inner speech could also explain the 
obtained results, so a further experiment was carried out to test whether articulating aloud 
has the same effect as noise. 

Before turning to Expt 2, Folkard's (1976) suggestion that high arousal, including that 
induced by noise, reduces subvocal activity must be reconsidered in the light of the evidence 
that noise and subvocal suppression have very different effects. Folkard gave five types of 
experimental evidence for his conclusion. 

(1) Muscle tension and noise improve colour naming but impair word naming in a 
Stroop task (colour names printed in different colour inks). Folkard suggests this implies 
reduced subvocalization in high arousal, but it could equally well imply impaired semantic 
processing in the way suggested by Schwartz (1975). 

(2) In recall of nine digits, presumably in order, muscle tension adversely affected 
performance only when subjects were free to rehearse, not when they were articulating or 
suppressing. Folkard argues that muscle tension must suppress subvocal rehearsal, bringing 
performance down to the level of the suppression group. However the instructions to 
articulate aloud did not counteract the supposed effect of muscle tension, as they 
presumably should if Folkard's suggestion is correct, and in the articulation conditions 
performance was slightly worse than in the condition where rehearsal was allowed and 
muscle tension was present. This suggests that articulation is not just making overt the 
activity of the articulatory loop but also interfering with other processes. As has already 
been pointed out, however, impairment due to articulation is not the normal finding in this 
type of task (Murray, 1965b, 1966, 1968 got improvement), so it may be unwise to base 
any conclusions on the Folkard result. 

(3) In free recall, time of day affected performance only in a condition where rehearsal 
was permitted, not when suppression was present. Folkard suggests that afternoon subjects 
do not normally rehearse as actively as morning subjects, but neither group can rehearse 
under suppression so both are reduced to the same level of performance. The results could 
be explained by reduced ability to organize the input using semantic cues in the more 
highly aroused afternoon group and under suppression ability to organize in this way is 
reduced in both groups. Such an explanation, however, is not consistent with the results of 
other experiments by Folkard (1979), which showed morning subjects impaired by acoustic 
similarity and prevention of rehearsal more than afternoon subjects in the sequential recall 
of words, and the reverse result for semantic similarity. These findings suggest that the 
supposed increases in arousal due to noise and time of day have opposite effects on 
semantic organization. However morning superiority has not always been found, even with 
ordered recall; Jones et al. (1978) obtained afternoon superiority, especially on the last few 
items with visual but not acoustic presentation or with slow presentation. None of these 
effects is obviously consistent with the suggestion that morning subjects engage in more 
maintenance processing. 

(4) In a running memory span task subjects were required to recall the last 10 digits 
from a list of uncertain length. A morning group was adversely affected as the rate of 
presentation increased from 1 item/s to 3 items/s, while an afternoon group was unaffected. 
Folkard suggests that the morning group’s more active strategy becomes increasingly 
difficult as the rate increases. However this group was worse at all the rates used and unless 
it is assumed that the active strategy requires an even slower rate, at which these subjects 
would perform better than the afternoon group, this is difficult to explain when they are 
supposed to be more actively engaged in the task. At the slowest rate the afternoon group 
was in fact inferior to the morning group in recalling the early items from the list, but 
superior on the later items; as rate of presentation increased the early item superiority of 
the morning group vanished due to a decline in morning group performance. Parallel 
results from studies using noise (Hamilton et al., 1977; Millar, 1979), where high noise 
produced inferior recall of early items and superior recall of late items, were cited earlier. 

It is now necessary to consider other explanations of these findings. Hamilton et al. 
(1977) suggested that in states of high arousal faster throughput of information occurs, 
giving increased efficiency on tasks requiring rapid handling of continuous input, but 
poorer memory registration. However this explanation of the effects of arousal cannot 
explain the frequent finding that long-term memory is superior after learning under high 
arousal. Though reduced semantic organization could be explained by reduced semantic 
processing in the aroused fast-throughput system, increased sequential organization 
throughout long lists of words could not. Without more precise specification of the nature 
of the errors on earlier items it is difficult to suggest a satisfactory alternative explanation, 
but the extended trace hypothesis (Daee & Wilding, 1977) could handle the result in the 
following way. In a running memory span task groups of items are taken as they arrive and 
placed in the AL. As new groups arrive the earlier items have to be transferred to the 
‘overflow system’ in working memory to enable the new information to be held in the AL. 
If high arousal increases the duration of traces, the information in AL (the most recent 
items) is superior in the aroused state, while that in working memory is impaired by the 
presence of traces surviving from still earlier items. This explanation predicts that intrusions 
from still earlier in the list cause the deterioration on the earlier recalled items in this task. 
However, Millar (1979) obtained the same result with eight-item lists when there are no 
earlier items from the same list; in his experiment subjects were tested on 120 lists in a 
session and the tendency for early items to be recalled less well and later items to be 
recalled better under noise increased as the experiment continued, which suggests that 
interference from preceding lists on items in working memory may have been responsible. 
The effect of rate of presentation in Folkard’s running memory span experiment can be 
incorporated in the explanation as due to the survival of more unwanted items from still 
earlier in the list in the afternoon (high arousal) group at all rates and in the morning 
group at faster rates. 

(5) In solving syllogisms, subvocal activity, as rated by observers from lip movements, 
decreased in noise. This could, however, be due to greater internalization of the activity to 
insulate it from the noise, or to overloading of working memory in higher arousal, reducing 
its ability to control subvocalization. 

In conclusion of this discussion, it is suggested that Folkard’s earlier data do not compel 
the conclusion that high arousal reduces activity in the articulatory loop and can equally 
well be explained in ways which are compatible with the view that high arousal increases 
dependence on the articulatory loop and reduces dependence on more complex memory 
systems. The recent (Folkard, 1979) experiments, however, do establish his claim more 
firmly and it must be concluded either that time of day and noise have opposed effects on 
semantic organization and should not be regarded as two equivalent methods of varying 
arousal, or that the differences are due to differences in the range of variation in arousal in 
the two cases, combined with a non-monotonic relation between arousal and performance. 


Experiment 2 


Experiment 2 investigated whether articulating aloud had similar effects on memory for 
order to those produced by noise. All details were as in Expt 1, except that instead of being 
asked to suppress inner speech, subjects were instructed, ‘Articulate means you should read 
all the letters aloud as they appear on the screen and continue to repeat them aloud until 
the asterisk appears’. None of the subjects had taken part in Expt 1. 


Results 


The effects of noise (F = 4.63, d.f. 1, 18, P < 0.05), acoustic confusability (F = 104.59, d.f. 
1, 18, P < 0.001), and serial position (F = 59.8, d.f. 1, 18, P < 0.001) were significant. 
Noise improved performance, acoustic confusability impaired it, and performance was 
worse on the middle positions in the list. These variables also entered into several 
interactions. Articulation impaired performance on confusable lists but improved it on 
non-confusable ones (F = 7.15, d.f. 1, 18, P < 0.025). Articulation also impaired 
performance at the slow rate of presentation but improved it at the fast rate (F = 21.65, d.f. 
1, 18, P < 0.001). Though the triple interaction of these three variables was not significant 
(F = 2.29, d.f. 1, 18) inspection of the means showed that at the fast rate articulation 
slightly improved performance in all conditions, but the main effect of articulation was a 
marked impairment of performance at the slow rate with acoustically confusable stimuli. 
This was the source of the impaired performance under articulation at the slow rate and 
with the confusable stimuli. Noise entered into two triple interactions, noise × acoustic 
confusability × articulation (F = 4.57, d.f. 1, 18, P < 0.05) and noise × acoustic 
confusability × rate of presentation (F = 8.86, d.f. 1, 18, P < 0.01), and one four-way 
interaction, noise × acoustic confusability × articulation × serial position (F = 3.69, d.f. 4, 72, 
P < 0.01). However all these interactions seemed to be due to ceiling effects, with noise 
having little effect on the condition in which performance was already near to the 
maximum possible (the non-confusable lists with articulation in the first case and the 
non-confusable lists at the fast rate of presentation in the second). The four-way 
interaction was also explicable as due to ceiling effects; with non-confusable items and 
articulation the serial position effect also vanished and performance on the middle items 
was as good as that on the end items, so there was no differential effect of noise (in fact no 
effect of noise) on different serial positions in these conditions, in contrast to conditions 
involving confusable and/or non-articulated items. There was one other significant 
interaction of acoustic confusability and serial position, with more confusable items giving 
a more pronounced serial position curve (F = 25.66, d.f. 4, 72, P < 0.001). Figure 3 shows 
the interaction of noise, acoustic confusability and articulation and Fig. 4 shows the 
interaction of articulation, rate of presentation and acoustic confusability. 

Figure 3. Mean probability of a correct response for acoustically confusable (A) and acoustically 
non-confusable (@) letters in the non-articulation and articulation conditions in Expt 2 at each noise 
level. [Panels: non-articulation conditions and articulation conditions. Vertical axis: probability of a correct response. Horizontal axis: noise level (dBC), 65 and 85.] 

Figure 4. Mean probability of a correct response for acoustically confusable (A) and acoustically 
non-confusable (69) letters in the non-articulation and articulation conditions in Expt 2 at each rate 
of presentation. 


Discussion 

The improvement under the higher noise level found in Expt 1 was also found in Expt 2, 
but this time it occurred with non-confusable as well as confusable lists in the 
non-articulation conditions (which were identical with the non-suppression conditions 
of Expt 1). The difference between the two experiments seems to lie mainly in the high level 
of performance in the 65 dBC condition with non-confusable items in Expt 1. 

Articulation improved performance on non-confusable lists and impaired it on 
confusable ones, and it impaired performance at the slow rate of presentation while 
improving it at the fast rate. However, as pointed out above, the major effect was an 
impairment at the slow rate with confusable stimuli. These results suggest that at the fast 
rate articulating the items aloud is an appropriate strategy, consistent with what subjects 
might do internally if left to their own devices. The degree of improvement was small and 
confirmation is needed whether articulation actually improves performance in these 
conditions or is merely non-interfering. At the slow rate, articulation presumably interferes 
with a more appropriate rehearsal strategy such as repeating all preceding items while 
awaiting the next. Acoustically confusable items are the main sufferers when such a 
strategy is prevented. This is consistent with the view that the traces of items not 
maintained in an articulatory loop decay rapidly and produce traces indiscriminable from 
each other when the items only differ on a few features, and that articulating aloud at the 
slow rate prevents such maintenance. 

There are some problems with this interpretation over the effects of noise. There may be 
some improvement when noise is added to articulation at the fast presentation rate 
(though this needs confirmation) and this is difficult to explain if articulation alone 
maintains an adequate trace. Secondly noise also improved performance when added to 
articulation at the slow presentation rate and this is also difficult to explain if articulation is 
preventing proper use of the AL. Possible explanations are that noise causes louder 
articulation or helps in some other way to maintain discriminable traces, or in the second 
case it may induce additional subvocal rehearsal in the intervals between items. 

The next two experiments were designed to establish more firmly whether articulation 
aided performance at the fast rate of presentation, and whether noise aided performance 
when added to articulation. 


Experiment 3 


The non-confusable stimuli and the longer recall delay were omitted to simplify the design 
(recall delay having produced non-significant effects in Expt 2). Six trials were given in each 
of four conditions, the combinations of the articulation and speed of presentation 
conditions. 

Noise consistently impaired performance but this effect was not significant (F = 3.69, d.f. 
1, 18). Planned comparisons were used to examine the effect of rate of presentation and of 
articulation at each rate. Performance was better at the slow rate (F = 8.82, d.f. 1, 54, 
P < 0.01) and articulation again improved performance at the fast rate (F = 2.93, d.f. 1, 36, 
P < 0.05 on a one-tailed test) and impaired it at the slow rate (F = 5.2, d.f. 1, 36, P < 0.05). 
The means are given in Table 1. 

Table 1. Mean probability of correct recall at each position in the non-articulation and 
articulation conditions at each presentation rate in Expt 3 

                      Presentation rate (s per item) 
                      0.5 (fast)     2 (slow) 
Non-articulation      0.55           0.73 
Articulation          0.63           0.63 

The difference in the effect of noise in this experiment was tentatively explained as 
follows. In Expt 2 in the articulation conditions subjects tended to complete a rehearsal 
cycle through the list before recall and even when recall was required after only three digits 
they often went on to complete the cycle before recall. It is assumed that in the 
non-articulation conditions a similar strategy operated. In Expt 3, however, the interval 
before recall was short and fixed so this strategy was discouraged and subjects only 
completed one cycle through the list as it was presented. Thus the process which is aided in 
noise was impaired. Expt 4, therefore, restored the variable delay, but again tested only the 
confusable stimuli, giving three trials in each of eight conditions. 


Experiment 4 

Noise aided performance in this experiment. Planned comparisons showed that in the 
non-articulation conditions the difference was significant (F = 4.2, d.f. 1, 36, P < 0.05) but 
not in the articulation conditions (F = 1.45, d.f. 1, 36). Noise interacted significantly with 
recall delay (F = 11.5, d.f. 1, 18, P < 0.01), having a strong effect at the short delay and no 
effect at the long delay. There was a non-significant effect in this direction in the 
corresponding conditions in Expt 2. There was also in the present experiment a significant 
effect of recall delay (F = 12.02, d.f. 1, 18, P < 0.01) with performance better at the long 
delay. Figure 5 shows these results. 

The same planned comparisons as in Expt 3 showed that articulation improved 
performance at the fast presentation rate (F = 3.86, d.f. 1, 36, P < 0.05 by a one-tailed 
test), and impaired it at the slow rate (F = 11.2, d.f. 1, 36, P < 0.01). Figure 6 shows this 
result. 

There were also two significant fourth-order interactions, articulation × rate of 
presentation × response delay × noise (F = 5.43, d.f. 1, 18, P < 0.05) and articulation × rate 
of presentation × serial position × noise (F = 6.55, d.f. 4, 72, P < 0.001). The first offered no 
clearly interpretable pattern, and the second appeared to be due to a deleterious effect of 
noise on the second half of the list when articulating at the slow presentation rate. Since 
neither of these interactions appeared in Expt 2, they will not be discussed further. 

Figure 5. Mean probability of a correct response for the short (O) and long (W) response delay in the 
non-articulation and articulation conditions in Expt 4 at each noise level. 


In summary, Expts 3 and 4 confirmed that articulation aids performance at the fast rate 
of presentation, so in these conditions its effect is in the same direction as noise. 
Experiment 4 confirmed the beneficial effect of noise in this task but failed to show a 
significant effect of noise added to articulation, so the question raised previously as to how 
noise could improve performance in these conditions does not arise. However, the failure to 
obtain beneficial effects of noise in Expt 3 suggests that noise effects are sensitive to other 
factors in the task, possibly through the ways in which these affect the type of memory 
strategy adopted. 

Figure 6. Mean probability of a correct response in the articulation (O) and non-articulation (A) 
conditions in Expt 4 at each rate of presentation. 


General discussion 


The results of these four experiments are compatible with the view that loud noise induces 
greater reliance on the articulatory loop which stores items in their order of arrival, thus 
improving recall of order. Suppression of internal articulation eliminated the advantage of 
performing the order recall task in 85 dBC white noise. Articulating the items aloud as they 
arrive, and continuing to do so until recall, improved performance provided the rate of 
presentation was fast, though the improvement was small. 

The main difference between noise and articulation effects was that articulation impaired 
performance at the slow rate of presentation while noise improved it; it was suggested that 
this is because the articulation instructions prevented the best strategy at this rate, while 
noise left subjects free to adopt the best internal strategy. The apparent difference between 
noise and articulation in their interaction with acoustic similarity, whereby noise improved 
performance on acoustically similar items only (Expt 1) while articulation impaired 
performance mainly on such items (Expt 2) was also due to articulation effects operating 
only at the slow rate of presentation and may therefore be explicable in the same way. 

Putting this conclusion another way, noise and articulation encourage maintenance 
rehearsal and not elaboration rehearsal. Mueller et al. (1978) have reached a similar 
conclusion concerning high anxiety and Folkard (1979) argues that increased arousal over 
the day has the reverse effect. This view implies that noise will have no effects on memory 
for visual patterns which are not verbally codeable and experiments we have done so far 
using visual matrix patterns have failed to show any effects of noise, supporting the view 
that noise affects only material which is coded in an articulatory-acoustic form in the AL. 
There are wider implications of this conclusion since the evidence discussed earlier suggests 
that both noise and articulation impede the reorganization of information using semantic 
features, and hence may reduce the availability of semantic information for subsequent 
cognitive processes. 


Effects of processing depth, distinctiveness, and word frequency on retention 
Michael W. Eysenck and M. Christine Eysenck 





The hypothesis that retention differences between semantic and phonemic encodings are in part 
attributable to greater distinctiveness of semantic encodings was investigated in two experiments. The 
memory-enhancing effects of instructions designed to increase trace distinctiveness were much greater 
with phonemic than with semantic processing. The usual inverse relationship between word frequency 
and recognition was reduced by instructions designed to increase trace distinctiveness, suggesting that 
the customary advantage of rare over common words may be partially due to greater encoded 
distinctiveness of rare words. 





There is substantial evidence that the semantic characteristics of presented verbal material 
are better retained than the phonemic characteristics. At the theoretical level, a major 
attempt to interpret this finding within the framework of a general model of memory was 
that of Craik & Lockhart (1972). In essence, they argued that the greater longevity of 
semantic than phonemic information was due to the greater depth of processing 
necessitated by semantic analysis. 

One major criticism of the Craik-Lockhart position is that there is no independent 
criterion of processing depth (Eysenck, 1977, 1978a, b, 1979; Nelson, 1977). A further 
difficulty is that semantic and phonemic processing of verbal material differ in several ways 
other than depth of processing. Of most immediate relevance, successful memory 
performance often depends heavily upon context-specific encoding at input. Since the 
to-be-remembered words are usually known by the subject, he is only able to discriminate 
subsequently between the to-be-remembered words and the other words he knows on the 
basis of ancillary information stored with each word. Jacoby (1974) claimed, with 
supporting evidence, that phonemic encoding is relatively invariant across different 
situations, whereas semantic encoding is context dependent. Since the semantic encoding of 
a given word in a given situation is different from the semantic encoding of that word in a 
different situation, the experimental encoding of a word is discriminable from prior 
encodings of the same word. This trace discriminability may be relatively lacking in the 
case of phonemic encoding. 

A major objective of this study was to explore the notion that superior memory 
performance for semantic compared to phonemic information may be due in part to the 
fact that semantic encoding of a word usually produces a distinctive trace that is 
discriminable from previous semantic encodings of that word, whereas phonemic encoding 
does not. In other words, it is argued that processing distinctiveness is usually confounded 
with processing depth. Trace distinctiveness is determined importantly by at least two kinds 
of factors: (1) the similarity relations among the encodings of a to-be-learned set of items; 
and (2) the similarity relations between the intra-experimental encoding of an item and its 
pre-experimental encodings. The emphasis of this study is primarily on the second type of 
similarity relation. The multidimensionality of trace distinctiveness must be recognized, e.g. 
a trace may be distinctive semantically but not orthographically, and vice versa. 

As has been pointed out by Eysenck (1979), a distinctiveness account is vulnerable to 
many of the same criticisms as a depth interpretation, especially with respect to the absence 
of an appropriate independent measure of the major explanatory concept. Nevertheless, the 
distinctiveness approach seems potentially to offer a more adequate theoretical account of 
the data than does the depth approach; it is reasonable to argue that an interp 
the superior retention of semantic encodings compared to physical encodings i 
the greater depth of processing of semantic encodings is otiose, being totally u 
with respect to the processes and mechanisms involved. 

A plausible mechanism whereby trace distinctiveness might affect retention v 
by Lockhart et al. (1976): ‘If there are many traces of the pattern (induced by 
stimulus), a “familiar” encoding will be achieved easily (since it is guided by n 
previous traces and thus easily encoded), but the stimulus will not elicit recogn 
specific previous instance (since many different contexts are competing for con: 
awareness). If the pattern was unique or distinctive, however, only one or a fe 
such a pattern exist — now if the retrieval information (the test stimulus) is spe: 
the previous trace will be contacted and the present encoding will contain cont 
features from the initial encoding’ (p. 83). 

There is evidence in the literature (e.g. Jacoby, 1974; Moscovitch & Craik, 1 
retention of phonemic encodings is poor and is unaffected by attempted manip 
trace distinctiveness, presumably because phonemic encodings are non-distincti 
circumstances. Indeed, Moscovitch & Craik (1976) concluded that phonemic ei 
necessarily poorly remembered: ‘The event must be discriminable and unique : 
before retention is enhanced’ (p. 457). In the first experiment, processing instrt 
manipulated in an attempt to produce distinctive phonemic traces. The usual r 
superiority of semantic over phonemic encodings should be attenuated or even 
distinctive phonemic encodings are produced via processing instructions. 

A further factor included in the first experiment was word frequency. It has 
found (cf. Gregg, 1976) that, while common words are, if anything, better recal 
words, rare words are better recognized. It has been suggested (e.g. Brown, 19' 
words are well recognized because they produce distinctive encodings, in the se 
there are (by definition) relatively few prior encodings of such words. If rare w 
usually more distinctively encoded than common words, a distinctiveness-enha: 
manipulation produced by varying processing-task requirements might have a : 
effect on rare and common words. More specifically, words usually receiving 
non-distinctive encodings (i.e. common words) should benefit to a greater exter 
words receiving distinctive encodings (i.e. rare words) from an attempt to incre 
distinctiveness. 


Experiment 1 
Method 


Materials. The Concise Oxford English Dictionary was searched for nouns with irregula: 
phoneme correspondence. Examples of such words are ‘glove’, which would rhyme witl 
had regular grapheme-phoneme correspondence, and ‘comb’, which has a silent final le 
of 240 such nouns was uncovered, and a median split on the basis of the Thorndike-Lc 
count (1944) produced 120 rare words (mean frequency = 5.09 per million) and 120 com 
(mean frequency in excess of 60 per million). 
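
The median split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `median_split` and the toy frequency values are hypothetical, and the source does not say how tied frequencies were handled (here ties simply fall wherever the sort places them).

```python
def median_split(freq_by_word):
    """Split words into (rare, common) halves by ranked frequency.

    freq_by_word maps each word to a frequency count (e.g. per million).
    Assumes an even number of words; ties are resolved by sort order.
    """
    ranked = sorted(freq_by_word, key=freq_by_word.get)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


# Hypothetical frequencies, for illustration only (not the published norms).
toy_pool = {"glove": 40.0, "comb": 12.0, "yacht": 3.0,
            "sword": 25.0, "aisle": 2.0, "wolf": 60.0}
rare_words, common_words = median_split(toy_pool)
```

Applied to the real pool of 240 nouns, a split of this kind would yield the two sets of 120 words described in the text.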


Subjects. There were 20 subjects, mostly students at the University of London. They we 
the 18-30 years age range. They received modest payment for their services. 


Design and procedure. On the presentation trials, each subject encountered a different ra 
of 96 words (48 common and 48 rare) selected from the total item pool. Twelve words i 
and six rare) were presented in each of four instructional conditions; this procedure wa: 
once for each subject. In the phonemic non-distinctive condition, subjects were required 
each word overtly with its usual pronunciation; in the phonemic distinctive condition, t 
asked to pronounce each word as if it had regular grapheme-phoneme correspondence; 
semantic non-distinctive condition, they were told to produce a descriptor (e.g. adjective) that 
typically modified each noun; and in the semantic distinctive condition, subjects were instructed to 
produce a descriptor that could be, but infrequently is, used as a modifier for each noun. 

Initially, each subject was told that the experiment was investigating the speed with which certain 
tasks could be performed. The nature of the four tasks was explained, and examples of each kind of 
processing given. The subject then sat down in front of a tachistoscope, and he was equipped with a 
throat microphone. On each trial, the experimenter gave a warning signal 2 s before the presentation 
of the word. The duration of the exposure was 0.5 s. The subject was told to perform the designated 
task as rapidly as possible, and to respond overtly. 

The 96 trials were divided into eight blocks of 12 trials each. Within each block, six common and 
six rare words were presented. Subjects received two blocks of trials under each of the four 
instructional conditions. Assignment of words to blocks and the ordering of the blocks were 
randomly determined. The subjects were told orally before the start of each block the kind of 
processing task which was to be performed on the words in that block. 
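
A minimal sketch of how such a presentation schedule could be generated is given below. It assumes pools of 120 common and 120 rare words and uses hypothetical names (`build_schedule`, `CONDITIONS`); shuffling the word order within each block is an added assumption, since the text specifies only that assignment of words to blocks and the ordering of blocks were random.

```python
import random

CONDITIONS = [
    "phonemic non-distinctive",
    "phonemic distinctive",
    "semantic non-distinctive",
    "semantic distinctive",
]


def build_schedule(common_pool, rare_pool, rng):
    """Return 8 blocks of (condition, 12 words): 6 common + 6 rare per block."""
    # Draw a fresh random sample of 48 common and 48 rare words for this subject.
    common = rng.sample(common_pool, 48)
    rare = rng.sample(rare_pool, 48)
    # Two blocks per instructional condition, in random order.
    block_conditions = CONDITIONS * 2
    rng.shuffle(block_conditions)
    schedule = []
    for i, condition in enumerate(block_conditions):
        words = common[6 * i:6 * i + 6] + rare[6 * i:6 * i + 6]
        rng.shuffle(words)  # assumption: within-block order is also random
        schedule.append((condition, words))
    return schedule


# Placeholder word labels standing in for the real item pool.
common_pool = [f"common{i}" for i in range(120)]
rare_pool = [f"rare{i}" for i in range(120)]
schedule = build_schedule(common_pool, rare_pool, random.Random(0))
```

Because the 48-word samples are themselves random, slicing them into consecutive groups of six gives a random assignment of words to blocks.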

Five minutes after list presentation, half the subjects unexpectedly received a test of free recall; the 
other half received a recognition test. The retention interval was filled with an irrelevant non-verbal 
task. On the recall test, subjects were given unlimited time to write down all the presented words they 
could remember. On the recognition test, subjects were given sheets containing the entire pool of 240 
words in random order and had to select exactly 96 words which they believed had appeared. 
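
As a point of reference for the recognition scores, the number of hits expected from blind guessing under this forced-selection procedure can be worked out directly. The sketch below is not an analysis reported by the authors; it assumes that a guessing subject is equally likely to select any word in the pool, and the function name `expected_chance_hits` is hypothetical.

```python
from fractions import Fraction


def expected_chance_hits(targets_in_cell, n_selected, pool_size):
    """Expected hits in a cell if n_selected words are picked at random.

    Each target is chosen with probability n_selected / pool_size, so the
    expected number of hits is targets_in_cell * n_selected / pool_size.
    """
    return Fraction(targets_in_cell * n_selected, pool_size)


# Expt 1: 12 words per condition-by-frequency cell, 96 selections, pool of 240.
chance_expt1 = expected_chance_hits(12, 96, 240)
```

Under these assumptions the guessing baseline is 4.8 hits out of a possible 12 in each cell.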

In sum, the experiment comprised two within-subjects factors: processing conditions and word 
frequency. There were three dependent variables of interest: processing time on the presentation 
trials, recall performance and recognition performance. 


Results 

Processing time. The first analysis was based upon the reaction time data given in Table 1. 
Since all of the subjects were treated identically until the retention test, the analysis was 
based upon the entire subject sample. The main effect of processing conditions was highly 
significant (F = 17.72, d.f. = 3, 57, P < 0.001) but there was no effect of word frequency, 
either as a main effect (F = 1.79, d.f. = 1, 19) or in interaction with processing conditions 
(F = 1.02, d.f. = 3, 57). This contrasts with the usual finding (e.g. Berry, 1971; Forster & 
Chambers, 1973) that common words are named more rapidly than rare words. While the 
reasons for this difference are not clear, it may be relevant that word frequency was more 
rigorously manipulated in these other studies. 


Table 1. Processing time as a function of processing conditions and word frequency (Times 
are in seconds. Standard deviations are in parentheses. Experiment 1) 

                                  Processing conditions 

            Phonemic            Phonemic         Semantic            Semantic 
            non-distinctive     distinctive      non-distinctive     distinctive 
Common      0.74 (0.24)         1.52 (0.49)      2.79 (1.11)         4.77 (3.15) 
Rare        0.82 (0.26)         1.51 (0.42)      3.11 (1.56)         5.13 (3.66) 


The ordering of the four processing conditions from fastest to slowest was as follows: 
phonemic non-distinctive; phonemic distinctive; semantic non-distinctive; and semantic 
distinctive. All group differences were significant at beyond the 0.001 level. Thus the data 
indicate that the semantic tasks were more time-consuming than the phonemic tasks, a 
finding previously obtained several times (e.g. Craik, 1973). In addition, processing time 
was longer under distinctive processing conditions at both the phonemic and semantic 
levels of analysis. 

Recognition. Analysis of the recognition data was based upon the number of correct choices 
or hits in the various conditions. The summary data are given in Table 2. The interaction 
between processing conditions and word frequency was significant (F = 3.40, d.f. = 3, 27, 
P < 0.05). Analysis of the simple main effects indicated that rare words were better 
recognized than common words in the phonemic non-distinctive condition (F = 16.82, 
d.f. = 1, 36, P < 0.001) and in the semantic non-distinctive condition (F = 5.12, 
d.f. = 1, 36, P < 0.05) but there was no word frequency effect in the phonemic distinctive 
condition (F < 1, d.f. = 1, 36) or in the semantic distinctive condition (F < 1, d.f. = 1, 36). 


Table 2. Number of items correct in recognition and in recall (standard deviations are in 
parentheses. Experiment 1) 

                  Phonemic             Phonemic             Semantic             Semantic 
                  non-distinctive      distinctive          non-distinctive      distinctive 
                  Rare     Common      Rare     Common      Rare     Common      Rare     Common 
Recognition       9.50     6.60        10.40    9.90        11.20    9.60        10.80    10.50 
memory           (1.44)   (2.12)      (1.39)   (1.29)      (0.79)   (1.27)      (1.03)   (0.97) 
Recall            1.60     0.80        1.70     3.30        3.10     3.30        2.60     3.00 
memory           (1.58)   (1.31)      (1.34)   (1.77)      (1.97)   (1.25)      (2.07)   (1.56) 


Overall, there was a significant main effect of word frequency (F = 9.49, d.f. = 1, 9, 
P < 0.025) with rare words being better recognized than common words. This replicates 
the typical inverse relationship between word frequency and recognition-memory 
performance (Gregg, 1976). The main effect of processing conditions was also significant 
(F = 33.08, d.f. = 3, 27, P < 0.001). The data of most interest concern the phonemic 
distinctive condition. Recognition performance in this condition was non-significantly 
different from the semantic non-distinctive condition (t = 0.86, d.f. = 9) and from the 
semantic distinctive condition (t = 2.04, d.f. = 9). However, performance in the phonemic 
distinctive condition exceeded that in the phonemic non-distinctive condition (t = 7.92, 
d.f. = 9, P < 0.001). Further investigation of this difference indicated that common words 
were better recognized in the phonemic distinctive condition than in the phonemic 
non-distinctive condition (t = 5.32, d.f. = 9, P < 0.001) but rare words showed comparable 
recognition performance in both conditions (t = 1.45, d.f. = 9). 
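
The pattern described in this section can be seen by computing the rare-minus-common difference in mean hits for each processing condition. The short sketch below uses the recognition cell means as read from Table 2; the variable and function names are hypothetical, and the calculation is purely descriptive rather than a re-analysis of the data.

```python
# Mean recognition hits per cell (maximum 12), as read from Table 2.
recognition_hits = {
    "phonemic non-distinctive": {"rare": 9.50, "common": 6.60},
    "phonemic distinctive": {"rare": 10.40, "common": 9.90},
    "semantic non-distinctive": {"rare": 11.20, "common": 9.60},
    "semantic distinctive": {"rare": 10.80, "common": 10.50},
}


def frequency_effect(cells):
    """Rare-minus-common difference in mean hits for each condition."""
    return {
        condition: round(means["rare"] - means["common"], 2)
        for condition, means in cells.items()
    }


effects = frequency_effect(recognition_hits)
for condition, diff in effects.items():
    print(f"{condition}: {diff:+.2f}")
```

The rare-word advantage is visibly larger in the two non-distinctive conditions than in the two distinctive conditions, in line with the simple main effects reported above.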

One possible interpretation of the non-significant effect of processing distinctiveness with 
semantic tasks is simply that semantic processing did not differ in the semantic 
non-distinctive and semantic distinctive tasks. This possibility was explored by means of 
independent ratings of the distinctiveness or typicality of the words produced in these two 
tasks in relation to the target words, the ratings being done on a five-point scale running 
from very typical (1) to very atypical (5). These ratings were done by three additional 
subjects. The mean rating across subjects was 3.88 for words produced in the semantic 
distinctive condition and 2.18 for words produced in the semantic non-distinctive task 
(t = 12.14, d.f. = 9, P < 0.001). The ratings for rare and common words did not differ 
significantly. 

Inspection of the recognition data suggests the existence of ceiling effects. This possibility 
was explored by reanalysing the data including the additional factor of learning ability. In 
this analysis, the five subjects having the highest overall recognition scores were designated 
‘good learners’ and the five having the lowest overall recognition scores were deemed 
‘poor learners’. While there was a substantial main effect of learning ability (F = 37.38, 
d.f. = 1, 8, P < 0.001) learning ability did not interact with either of the other factors. The 
implication is that the obtained pattern of results in the main analysis was not dependent 
on an overall high level of recognition. 


Recall. Analysis of the recall data was based upon the number of items correctly recalled in 
each of the conditions. The summary data are given in Table 2. There was a significant 
interaction between processing conditions and word frequency (F = 3.37, d.f. = 3, 27, 
P < 0.05). Analysis of the simple main effects indicated that the only processing condition 
to show an effect of word frequency was the phonemic distinctive condition, in which 
common words were better recalled than rare words (F = 6.67, d.f. = 1, 36, P < 0.025). 
The main effect of word frequency was non-significant, but there was an effect of 
processing conditions (F = 7.10, d.f. = 3, 27, P < 0.001). Recall was higher in the 
phonemic distinctive condition than in the phonemic non-distinctive condition (t = 4.64, 
d.f. = 9, P < 0.001) and in the semantic distinctive condition than in the phonemic 
distinctive condition (t = 2.50, d.f. = 9, P < 0.05) but there was no difference between the 
phonemic distinctive and semantic non-distinctive conditions (t = 0.55, d.f. = 9). Further 
investigation of the difference between the phonemic non-distinctive and phonemic 
distinctive conditions indicated that common words were better recalled in the phonemic 
distinctive condition than in the phonemic non-distinctive condition (t = 5.00, d.f. = 9, 
P < 0.001) but rare words showed comparable recall performance in both conditions 
(t = 0.24, d.f. = 9). 


Discussion 


The results of this experiment confirm the notion (Craik & Lockhart, 1972) that the depth 
to which information is processed is an important determinant of subsequent memory 
performance. For both recall and recognition, the overall level of performance for those 
items that had received a semantic processing task was substantially higher than for those 
that had received a phonemic processing task. This finding has been obtained several times 
previously (e.g. Craik, 1973; Craik & Tulving, 1975). 

However, the main finding of this experiment is that the retention advantage associated 
with semantic levels of analysis over phonemic levels depends on the distinctiveness of the 
phonemic encodings. The crucial condition is the phonemic distinctive condition, which 
presumably produced phonemic encodings that were unique in the subject's experience. In 
this condition, phonemic encodings were recognized as well as semantic encodings, and 
considerably better than conventional phonemic encodings. The results were similar for 
recall, with distinctive phonemic encodings being better recalled than conventional 
phonemic encodings, and recalled at comparable levels to non-distinctive semantic 
encodings. However, distinctive phonemic encodings were less well recalled than distinctive 
semantic encodings. 

Superficially, the finding that distinctive phonemic encodings are as well recognized as 
semantic encodings appears inconsistent with the results obtained by Jacoby (1974) and by 
Moscovitch & Craik (1976). They found that only semantic encodings were enhanced by 
attempts to increase the encoded distinctiveness of words. However, distinctiveness was 
manipulated in a very different fashion in this experiment than in previous work. Neither 
Jacoby nor Moscovitch & Craik succeeded in producing phonemic encodings of words that 
were substantially different from prior phonemic encodings of those words, whereas unique 
phonemic encodings were produced in the phonemic distinctive condition. It also appears 
that the contention of Moscovitch & Craik that phonemic encodings are necessarily poorly 
remembered is erroneous. 

The favoured interpretation of these findings is that the study encoding of an item must 
be discriminable from other encodings of that item if it is subsequently to be recalled or 
recognized as having been presented at study. Let us also assume that there are 
considerably more potential semantic encodings of verbal items than there are potential 
phonemic encodings. It would then follow that semantic levels of analysis would usually 
produce more discriminable memory traces than phonemic levels, and this would lead to 
the frequently observed memorial superiority of semantically encoded material. On this 
reasoning, the memorial advantage of semantic over phonemic levels of processing would 
be attenuated or even eliminated provided that it were possible to devise a phonemic 
processing task that produced distinctive traces. The results, of course, supported this 
prediction. 

While there was no effect of manipulating distinctiveness at the semantic level, it should 
nevertheless be possible to obtain such an effect under suitable conditions. One possible 
reason why manipulation of processing typicality failed to affect retention of semantic 
encodings in this experiment is that distinctive processing may direct attention away from 
the defining properties of the presented material. 

An extension of the distinctiveness hypothesis would be the notion that the retention of 
an item depends on (a) the similarity of prior encodings of that item to the 
intra-experimental encoding; and (b) the number of such prior encodings. In general, 
distinctiveness of the intra-experimental encoding will vary inversely with both the 
similarity and the number of prior encodings. The number of prior encodings of any 
particular word is obviously a function of that word’s frequency of occurrence in a given 
subject’s experience. It follows that the intra-experimental encoding of a rare word will tend 
to produce a more distinctive trace than is the case with a common word. Consistent with 
this argument, recall and recognition of common words were enhanced by distinctive as 
compared to non-distinctive processing tasks, whereas the retention of rare words was 
unaffected by processing distinctiveness. 

It should be noted that common and rare words undoubtedly differ in ways additional to 
those considered above. In particular, the frequency paradox (Gregg, 1976), in which 
common words are, if anything, better recalled but less well recognized than rare words, 
seems to indicate that there are many complex differences between rare and common 
words. It may well be (e.g. Lockhart et al., 1976) that recall involves a greater emphasis on 
reconstructive activities than does recognition, and that common events are more easily 
integrated into existing cognitive structures and thus more easily reconstructed on the recall 
test. 

An alternative interpretation of the main finding that retention was better in the 
phonemic distinctive condition than in the phonemic non-distinctive condition is simply 
that a greater degree of semantic processing occurred in the former condition. A possible 
reason for this is that the purposeful mispronunciation required in the phonemic distinctive 
condition may be difficult to achieve without some attention being paid to each word’s 
meaning. The most natural prediction from this viewpoint is that both common and rare 
words would show comparable increases in memory performance between the phonemic 
non-distinctive and the phonemic distinctive conditions. In fact, while this was the case for 
common words, there was a non-significant effect on the retention of rare words. Thus one 
advantage of a distinctiveness account is that it predicts this differential effect of processing 
conditions on retention of common and rare words, whereas a semantic interpretation of 
the data does not. 

On the other hand, a semantic interpretation can readily account for the finding that the 
effects of distinctiveness were much smaller at the semantic level than at the phonemic 
level, since memory performance could then simply be attributed to the amount of semantic 
processing involved in each condition, rather than to distinctiveness. In view of the 
importance of this issue, some additional relevant evidence was obtained in Expt 2. 

The results may be of some value in increasing understanding of the inverse relationship 
between word frequency and recognition. We found that there was no word-frequency 
effect in recognition memory when distinctive processing tasks were used. This suggests 
that interpretations of the word-frequency effect in recognition memory in terms of 
objective physical characteristics (Schulman, 1967) or orthographic distinctiveness 
(Zechmeister, 1972) may not prove adequate. It is likely that the most appropriate 
interpretation of the word-frequency effect in recognition memory will focus upon the 
stimulus-as-encoded rather than upon the stimulus-as-presented. It may be that rare words 
are only better recognized than common words when their encoded representations are 
more distinctive. In other words, what is relevant is the rarity of the encoding that a word 
receives rather than the rarity of the word as a presented stimulus. 

It has been suggested (e.g. Kinsbourne & George, 1974) that the superior recognition of 
rare words may be due in part to the fact that rare words receive more elaborate encodings 
than common words. To the extent that the processing-time measure recorded in the 
presentation phase is a valid index of encoding thoroughness, this hypothesis is inconsistent 
with the obtained data. There was no influence of word frequency on processing time, 
either as a main effect or in interaction with other factors. 


Experiment 2 


There are some difficulties in the interpretation of the first experiment. For example, there 
were uncontrolled, and highly significant, differences in processing time among the 
instructional conditions. Although the evidence suggests that the nature of the processing 
task is a much more consequential determinant of retention than is processing time per se 
(Craik & Tulving, 1975; Gardiner, 1974), it seemed desirable to replicate the first 
experiment while equating processing time. 

A second problem concerns the finding that memory performance was much higher 
under the phonemic distinctive than the phonemic non-distinctive condition. A plausible 
interpretation of this finding is that there is more extensive phonemic encoding in the 
phonemic distinctive than the phonemic non-distinctive condition, possibly because of the 
ease of access to the normal phonemic encoding. Craik & Tulving (1975) have obtained 
evidence of the enhancing effect of spread of encoding upon retention. In order to 
investigate this hypothesis, a dual-phonemic task was included in the second experiment. In 
this condition, the subject performed both the phonemic distinctive and phonemic 
non-distinctive tasks on each word. The effects of spread of encoding at the semantic level 
were investigated by means of a dual-semantic task, in which both the semantic 
non-distinctive and semantic distinctive tasks were performed on each word. In addition, 
an attempt was made to investigate the nature of the stored memory trace in each 
condition more directly than was possible in the first experiment. The subjects had to 
decide which of the various encoding tasks had been used for each item that was recalled or 
recognized. 

Method 


Subjects. There were 37 subjects, all of whom were students at the University of London. All the 
subjects were in the age range 25-40 years. They received course credit for their participation. 


Design and procedure. The same item pool as in Expt 1 was used. There were six presentation 
conditions, the four used before (phonemic non-distinctive; phonemic distinctive; semantic 
non-distinctive; and semantic distinctive) and two new conditions (dual-phonemic and 
dual-semantic). In the dual-phonemic condition, subjects were instructed to perform both the 
phonemic non-distinctive and phonemic distinctive tasks on each word. Similarly, in the
dual-semantic condition, subjects were told to use both the semantic non-distinctive and semantic 
distinctive tasks on each word. 

Subjects received different random samples of 108 words selected from the total pool of 240 words. 
Eighteen words were used in each of the six conditions, and subjects were required to spend 10 s
processing each word. When the subject had completed the designated task on each word, he was to 
reiterate the required processing until the end of the 10 s period, at which time he was to proceed to 
the next word in the booklet whenever the experimenter said ‘Right’. The appropriate processing 
task for the words on each page was indicated at the top of the page. 
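
The list construction described above can be sketched as follows. This is only an illustrative sketch, not the authors' materials: the placeholder word pool, the function name build_booklet and the seed are assumptions introduced here, and the order of conditions within a booklet is not specified in the text.

```python
import random

# The six presentation conditions of Expt 2 (order is an assumption).
CONDITIONS = [
    "phonemic non-distinctive", "phonemic distinctive", "dual-phonemic",
    "semantic non-distinctive", "semantic distinctive", "dual-semantic",
]

def build_booklet(pool, words_per_condition=18, seed=None):
    """Draw a random sample from the pool and split it evenly across conditions."""
    rng = random.Random(seed)
    n_needed = words_per_condition * len(CONDITIONS)  # 18 * 6 = 108 words
    sample = rng.sample(pool, n_needed)               # sampling without replacement
    return {
        cond: sample[i * words_per_condition:(i + 1) * words_per_condition]
        for i, cond in enumerate(CONDITIONS)
    }

# Hypothetical stand-in for the 240-word item pool.
pool = [f"word{i:03d}" for i in range(240)]
booklet = build_booklet(pool, seed=1)
```

Giving each subject a different seed would correspond to the different random samples mentioned in the text.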

Thirty minutes after list presentation, subjects were randomly assigned to a free recall (n = 19) or a
recognition test (n = 18). The subjects were occupied with an irrelevant task throughout the retention
interval. In free recall, each subject had unlimited time to write down as many presented items as he 
or she could remember in any order, and to indicate beside each word the type of processing 
associated with each word during presentation. In recognition, subjects received sheets containing the 
entire pool of 240 items and had to select exactly 108 words. Ticks were placed beside selected words. 
All of the subjects selected the required number of words. They had to indicate beside each word 
recognized the type of processing associated with each word during presentation. The retention test 
(whether recall or recognition) was unexpected. The experiment was conducted as a group 
experiment. 
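
Because subjects had to tick exactly 108 of the 240 items, some correct selections would be expected even from a subject choosing at random. The sketch below works out that baseline for one condition. It assumes purely random selection (so that the number of hits follows a hypergeometric distribution); the function name chance_hits is introduced here for illustration only.

```python
from fractions import Fraction

POOL_SIZE = 240           # items on the recognition sheets
N_SELECTED = 108          # items each subject had to tick
WORDS_PER_CONDITION = 18  # presented words per condition

def chance_hits(n_targets, n_selected=N_SELECTED, pool_size=POOL_SIZE):
    """Expected number of targets among n_selected items chosen at random
    (mean of the hypergeometric distribution)."""
    return Fraction(n_targets * n_selected, pool_size)

baseline = float(chance_hits(WORDS_PER_CONDITION))  # 18 * 108 / 240 = 8.1
```

On this assumption about 8.1 of the 18 words in each condition would be selected by chance alone, which is a useful reference point when reading the recognition means.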


Results 
Recognition. The analysis of the recognition data was based upon the number of items 
correctly recognized in each condition. The summary data are given in Table 3. There was
a significant interaction between processing depth and processing distinctiveness, F = 5.63,
d.f. = 2, 34, P < 0.025. Analysis of the simple main effects indicated that processing
distinctiveness had an effect on the phonemic tasks (F = 9.28, d.f. = 2, 68, P < 0.001) but
did not have an effect on the semantic tasks (F = 1.98, d.f. = 2, 68). The greater effect of
processing distinctiveness on phonemic than on semantic tasks replicates in essence the
findings from the first experiment. A t test indicated that recognition performance was
better under the phonemic distinctive than the phonemic non-distinctive condition
(t = 3.56, d.f. = 17, P < 0.01). Overall, the only significant main effect was that of
processing depth (F = 77.72, d.f. = 1, 17, P < 0.001) with semantic processing producing
much higher recognition performance than phonemic processing.


Table 3. Number of items correct in recognition and in recall (standard deviations are in
parentheses. Experiment 2)

                       Non-distinctive    Distinctive     Dual
Recognition memory
  Phonemic                  10.89            12.28        11.22
                            (2.08)           (1.81)       (2.26)
  Semantic                  15.39            14.72        15.11
                            (1.24)           (1.76)       (1.45)
Recall memory
  Phonemic                   0.58             1.05         1.53
                            (0.69)           (0.78)       (1.02)
  Semantic                   2.84             2.16         3.32
                            (2.14)           (1.57)       (1.86)
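
The direction of the interaction can be seen from the recognition means in Table 3 alone. The following sketch takes those cell means as given and computes, for each depth of processing, the gain from the non-distinctive to the distinctive condition. It is descriptive arithmetic only and does not reproduce the analysis of variance; the names used are illustrative assumptions.

```python
# Mean number of items recognized (Table 3, Expt 2).
recognition_means = {
    "phonemic": {"non-distinctive": 10.89, "distinctive": 12.28, "dual": 11.22},
    "semantic": {"non-distinctive": 15.39, "distinctive": 14.72, "dual": 15.11},
}

def distinctiveness_gain(depth):
    """Distinctive minus non-distinctive mean for one depth of processing."""
    cells = recognition_means[depth]
    return round(cells["distinctive"] - cells["non-distinctive"], 2)

gains = {depth: distinctiveness_gain(depth) for depth in recognition_means}
# gains["phonemic"] is 1.39 items; gains["semantic"] is -0.67 items
```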


Recall. The analysis of the recall data was based upon the number of items correctly 
recalled in each condition. The summary data are given in Table 3. The recall data showed 
a large effect of processing depth (F = 36.00, d.f. = 1, 18, P < 0.001) in which semantic
processing produced a much higher level of recall than did phonemic processing. There was
a marginally significant main effect of processing distinctiveness (F = 2.83, d.f. = 2, 36,
P < 0.05) due mainly to superior recall in the dual-task conditions. The comparison of
non-distinctive and distinctive conditions was non-significant, t < 1, but recall in the
dual-task conditions was better than in the distinctive conditions (t = 1.72, d.f. = 18,
0.10 > P > 0.05). These findings indicate a modest enhancing effect of spread of processing
on recall, in line with results obtained by Craik & Tulving (1975).


Processing task judgements: Recognition. There were extremely low levels of recall 
performance overall, with 16 out of the 19 subjects failing to recall any items in at least one 
of the six conditions. This precluded any detailed consideration of the accuracy of 
processing task judgements in free recall. Accordingly, the main emphasis will be on 
processing task judgements in the recognition test, for those items correctly recognized. For 
each presentation task, the mean percentages of judgements falling into the six judgemental 
categories are shown in Table 4. It can be seen that subjects were reasonably accurate in
their processing task judgements; furthermore, inaccurate judgements tended to retain
information about the depth of processing. There was a significant interaction between
processing depth and processing distinctiveness (F = 4.83, d.f. = 2, 34, P < 0.025). In this
interaction, there were significant effects of processing depth with non-distinctive tasks
(F = 5.92, d.f. = 1, 51, P < 0.025) and with dual-task conditions (F = 18.11, d.f. = 1, 51,
P < 0.001) but no effect with distinctive tasks. The greater accuracy for semantic tasks
than for phonemic tasks is consistent with the greater memorability of semantic
information (Craik & Lockhart, 1972). Of the main effects, processing distinctiveness did
not influence judgemental accuracy, but accuracy was significantly greater for semantic
than for phonemic conditions (F = 12.68, d.f. = 1, 17, P < 0.001).


Table 4. Judgemental task data as a function of initial processing task (figures given are
mean percentages. Experiment 2)

                                  Initial processing task
                  Distinctive          Non-distinctive             Dual
Guesses        Phonemic  Semantic    Phonemic  Semantic    Phonemic  Semantic
Phon. dist.      43.6       4.8        26.3       7.4        34.2       7.2
Sem. dist.        3.7      44.2         1.4      14.1         2.0      13.4
Phon. non-d.     21.1       6.6        38.4       5.2        22.8       4.1
Sem. non-d.       4.2      18.0         5.7      50.6         3.9      16.1
Phon. dual       26.5       3.3        27.4       6.6        34.0       4.1
Sem. dual         0.9      23.5         1.0      15.4         2.7      55.3

If the recognition-test superiority of the phonemic distinctive conditions over the other 
phonemic conditions was due to more extensive semantic and/or phonemic encoding in the 
phonemic distinctive condition, then a possible prediction is that this extraneous processing 
would be reflected by a smaller percentage of accurate processing task judgements in the 
phonemic distinctive than in the phonemic non-distinctive or dual-phonemic conditions. In 
fact, there was no significant overall difference in accuracy among the three phonemic 
conditions. 

Discussion


In this experiment, the major finding of the first experiment was replicated, at least when a 
recognition test was used. This was that distinctive processing enhanced memory 
performance on phonemic tasks, but had no effect on semantic tasks. It is important to 
note that this result was obtained when processing time in all conditions was equated. 

There are two possible interpretations of the finding that the manipulation of processing 
distinctiveness at the semantic level had only a marginal effect on memory performance. 
The first possibility is that there was an insufficiently incisive manipulation of 
distinctiveness at the semantic level, and the second possibility is that the retention of 
semantic encodings is intrinsically little affected by variations of distinctiveness. The 
interpretative problem extends to the interaction between distinctiveness and depth of 
processing, which may reflect either a genuine difference in susceptibility to effects of
distinctiveness at different levels of processing or simply differences in the extent to which 
distinctiveness was manipulated at the phonemic and semantic levels of analysis. In view of 
the findings of Jacoby (1974) and of Moscovitch & Craik (1976), we are inclined to believe 
that effects of distinctiveness at the semantic level could be demonstrated, but we 
appreciate that a complete resolution of the issue would necessitate further research. 

The most natural expectation from a distinctiveness position is that retention of the form 
of processing should be greater after distinctive than after non-distinctive processing. 
However, the actual finding was that the distinctiveness of processing had no effect on the 
accuracy of processing task judgements. Since subjects only accurately judged the 
processing task performed on recognized words on approximately 50 per cent of occasions, 
it is clear that there can be a ‘match’ of information between the memory trace and the
retrieval environment (Tulving, 1976), without subjects necessarily being consciously aware 
of the basis of the match. 

In both experiments, the differential effect of processing distinctiveness on phonemic and 
semantic tasks seems to have been rather greater on recognition than on free recall. Since
no completely adequate account of the salient differences between recall and recognition is 
available, a somewhat speculative interpretation of the data will be offered. Underwood 
(1969) has argued for the value of distinguishing between discriminative and retrieval 
attributes of memory. While recall relies heavily on retrieval attributes, recognition relies 
on those attributes that serve to discriminate one memory trace from another. If the 
distinctiveness of encoding that was produced by the distinctive processing tasks involved 
the encoding of additional discriminative attributes, then the apparently greater effect of 
processing distinctiveness on recognition memory is explicable. Of relevance here, Tversky 
(1973) obtained evidence that successful recognition depends on encoding enough detail 
about the appearances of to-be-remembered stimuli in order to discriminate them from 
similarly appearing stimuli during the test, whereas recall is enhanced by conditions that
encourage the formation of associations and interrelations among the items. 

While it may be true, as the distinctiveness hypothesis contends, that the greater the 
dissimilarity between the study-trial encoding and previous encodings of the same item, the 
greater the potential level of recognition, it is clear that the extent to which this potential is 
realized is dependent upon the test-trial encoding context. Since a distinctive encoding is, 
by definition, highly specific to a given situation, such encodings may be thought of as 
‘context-bound’. Any difference in the encoding context at study and at test may be
expected to produce a more substantial decrement in recognition memory for distinctive 
than for non-distinctive encodings. Such considerations may help to explain the greater 
effects of distinctiveness on recognition than on recall. 

A further related way in which processing distinctiveness might affect recognition 
memory is as follows. On the recognition test, each item is encoded on the basis of stored 
information, i.e. intra-experimental item information plus information from prior encodings
of that item. In a sense, the subject must decide whether or not the test encoding is the 
result of extra-experimentally acquired information alone or whether it represents an 
amalgam of extra- and intra-situational information. Since there is evidence (e.g. Winograd 
& Conn, 1971) that extra-experimental information will tend to produce a relatively 
non-distinctive or dominant encoding, it would follow that the discrimination between test 
encodings containing intra-experimental information and those not containing such 
information would be facilitated if the intra-experimental study encoding were distinctive. 

Spread of encoding in the dual-task conditions enhanced recall slightly in Expt 2, but it 
did not facilitate recognition. No obvious explanation of this modest differential effect of 
spread of encoding on recall and recognition suggests itself. There was no direct evidence 
that the superiority of recognition performance under the phonemic distinctive over the 
phonemic non-distinctive condition was due to greater spread of encoding at the phonemic 
level. 

The data collected from the processing task judgements did not indicate that the 
recognition superiority of the phonemic distinctive condition over the phonemic
non-distinctive condition was due to increased semantic processing in the former condition.
The mean percentage of processing task judgements of phonemic distinctive items that
included one of the semantic processing tasks was 8.8, against 8.1 per cent for phonemic
non-distinctive items, and 8.6 per cent for dual-phonemic items. These percentages do not
differ significantly from each other. 
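
The three percentages just quoted can be recovered by summing, for each phonemic presentation task, the Table 4 entries for the three semantic judgement categories. The sketch below does this with the Table 4 entries read as percentages to one decimal place; the variable and function names are illustrative assumptions.

```python
# Judgement categories in the row order of Table 4.
TASKS = ["phon. dist.", "sem. dist.", "phon. non-d.",
         "sem. non-d.", "phon. dual", "sem. dual"]

# Mean percentage of judgements for each phonemic presentation task (Table 4 columns).
judgements = {
    "phonemic distinctive":     [43.6, 3.7, 21.1, 4.2, 26.5, 0.9],
    "phonemic non-distinctive": [26.3, 1.4, 38.4, 5.7, 27.4, 1.0],
    "dual-phonemic":            [34.2, 2.0, 22.8, 3.9, 34.0, 2.7],
}

def semantic_share(percentages):
    """Total percentage of judgements naming any of the semantic tasks."""
    total = sum(p for task, p in zip(TASKS, percentages) if task.startswith("sem."))
    return round(total, 1)

shares = {task: semantic_share(col) for task, col in judgements.items()}
# shares: 8.8 (phonemic distinctive), 8.1 (phonemic non-distinctive), 8.6 (dual-phonemic)
```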

There are some similarities between the work reported here and the approach adopted 
by Kolers (1979). In a number of studies, Kolers has used a paradigm in which subjects 
initially read a set of sentences, some of which were in normal orientation and the 
remainder of which were inverted. On a subsequent recognition test, subjects showed some 
retention of typography (a physical characteristic), especially for inverted sentences, even at 
retention intervals of one month or longer. 

While we agree with Kolers that physical forms of processing do not necessarily manifest 
poor long-term retention, there are some problematical aspects of his methodology, and 
some points of difference. For example, the time required to read the inverted sentences in 
his studies was between six and ten times that needed to read the sentences in normal 
orientation, thus confounding typographical orientation and processing time. In addition, 
we are interested primarily in the effects on retention of differences in the stimuli-as- 
encoded, and so the stimuli-as-presented were the same in the various conditions. In 
contrast, Kolers (1979) has consistently used different stimuli-as-presented in his 
experimental conditions, even though he is also mainly interested in retention as a function 
of the nature of the encoded stimulus. A final point of difference is that Kolers has not 
directly compared memory performance of physical and semantic encodings, whereas that 
is a major focus of this study. 

In sum, the evidence from both experiments indicates that the retention differences 
between semantic and phonemic encodings are, at least in part, attributable to greater 
distinctiveness of semantic encodings. Evidence is accumulating that effects of processing 
depth on retention are partially dependent on spread of encoding (Craik & Tulving, 1975) 
and on distinctiveness (Jacoby, 1974; Moscovitch & Craik, 1976). The other major finding 
was that the usual inverse relationship between word frequency and recognition was 
reduced or even eliminated by instructions designed to increase trace distinctiveness, 
suggesting that the customary advantage of rare over common words may be partially due 
to greater encoded distinctiveness of rare words. 


Relations among symmetry, asymmetry, perceptual comprehension of
numerals by kindergarten and first grade children 


George S. C. Cheong 





The main purpose of the study was to investigate whether or not the perceptual comprehension of 
symmetrically designed numerals is higher than that of asymmetrically designed ones. Other
peripheral hypotheses on sex difference, differences between children of various socio-economic
classes, and the differences between kindergarteners and first graders were also examined. 
Seventy-three children participated in the study. The results of the study indicated that the major 
hypothesis was rejected, and that some subhypotheses were accepted. Implications were drawn from 
the findings of the study, as they relate to the differentiated functions of the brain, the teaching of 
reading and art, and to other cognitive styles.





Research into the differentiated functions of the left and the right hemispheres has been 
increasing significantly, especially during the past decade or so. Sperry (1968) and Bogen 
(1977) contended that in spite of a great deal of overlap and commonality in their 
functions, the cortical hemispheres of the brain characteristically organize and encode 
information in two different ways. The left cortical hemisphere in about 98 per cent of the
right-handed individuals and in about two-thirds of the left-handed individuals specializes 
somewhat in a propositional, analytic-sequential, time-oriented serial organization which is 
well adapted to learning and remembering verbal information. The right hemisphere in 
these same groups of individuals specializes somewhat in an appositional, synthetical- 
gestalt organization well adapted to processing information in which the parts acquire 
meaning from the whole or through their relations with the other parts. 

Kimura (1973) examined hemispheric cognitive processes in normal individuals and 
presented different information to each hemisphere simultaneously by using dichotic 
listening and visual-perceptual tasks. She found in normal right-handers that the left 
hemisphere was better adapted than the right hemisphere at tasks involving auditorily 
presented words, nonsense syllables and backward speech, visually presented letters and 
words, and skilled movements and gesticulations. The right hemisphere was better than the 
left hemisphere at auditory tasks involving melodies and non-speech human sounds; at 
visual tasks involving locating points in two dimensions, dot and form enumeration, 
matching slanting lines, and stereoscopic perception. 

After reviewing studies dealing with patients and normal people, Nebes (1977) concluded
that the right hemisphere processes spatial relationships and complex, difficult-to-name
tactile and auditory stimuli. For example, some of these so-called right hemispheric tasks
are the orientation of lines, spatial direction, spatial pattern, and the like.

Having reviewed recent research on the brain, Wittrock (1978), while in agreement with 
the models of encoding, namely verbal-analytic processes and holistic imagery, postulated 
that ‘the hemispheric functions of the brain are distinguished more by the way they 
organize or represent information than by the type of information they organize’. 

Kinsbourne (1975) suggested another theory, that is, the development of hemispheric
functions involves increases in proficiency as well as in learning to allocate attention to the 
hemispheric processes, rather than increases in degree of lateralization. 

Luria (1976) indicated in his summary of the functional organization of the brain that it 
is hierarchically organized to integrate messages coming from lower sources. The brain also 
specializes within each hemisphere as well as across hemispheres. The conception of
dichotomous function would do injustice to the sophistication and complexity of the 
human brain. 

Symmetry seems to characterize our man-made environment including objects that we 
have constructed for daily use. Is it more convenient to manipulate symmetrical objects 
than their contrary? Is it more agreeable to our senses to have symmetrical things than 
asymmetrical ones? Is it easier to comprehend symmetrically designed objects than those 
asymmetrically designed? Are manipulatory convenience and sensory agreeableness due to 
our environmental conditioning? If we assume that there is a certain degree of validity
underlying the differentiated functions of the left and the right hemispheres of the brain, to
what extent, if any, is the perceptual comprehension of symmetry and of asymmetry of
objects related to the hypothesized, differentiated functions of the brain? These questions
prompted the investigator to examine, as a starting point, the perceptual comprehension of 
symmetry and asymmetry of objects in our environment. 

In Rosen's study (1955), a positive relation was found between training in art and 
preference for complexity and symmetry. Examining the relation between creativity and 
preference for complexity and symmetry, Barron (1963) obtained a positive finding. 
McWhinnie (1966, 1968, 1970) conducted a series of studies which are related to the work 
of Barron and Rosen. In McWhinnie's first study, he found that the scores of boys who 
had received perceptual training were positively related to preferences for 
complexity-asymmetry on the Welsh Figure Preference test (WFPT), but he also found that 
scores on the WFPT were negatively related to rating in figure drawings. In his second 
study, there was a mix of positive and negative findings resulting from his treatment of
perceptual training. McWhinnie's third study yielded a significant, positive relation between 
those who were given the perceptual training programme and their preferences for 
complexity-asymmetry on the WFPT. 

The purpose of this study was to investigate the following hypotheses: 

(1) The scores on comprehending symmetrically designed numerals perceptually are 
higher than those on comprehending asymmetrically designed ones. 

(2) There is a difference between kindergarten and first grade children on symmetry as 
well as on asymmetry scores. 

(3) There is no significant difference between boys and girls on symmetry as well as on 
asymmetry scores. 

(4) There is a difference between middle and lower (SES) class children on symmetry as 
well as on asymmetry scores. 


Method 
Subjects 


A total of 73 children, consisting of 45 kindergarteners (two classes) and 28 first graders, and by sex, 
38 boys and 35 girls, took part in this study. The children were from a small university town situated 
in Eastern Canada. 


Definitions 
The definition of symmetry or asymmetry can become a matter of controversy if not explicitly
described. The kind of symmetry conceived of here was that in which, when a figure is folded along a
particular line, two identical halves coincide. Figure 1 represents one of the ways to illustrate the
concepts of numerals 1, 2, 3, 4 and 5 by symmetrical arrangement of pegs, and it was this way that was
used to arrange pegs in this study in order to depict symmetry.

Figure 2 shows one of the ways to represent the numerals 1, 2, 3, 4 and 5 by asymmetrical 
arrangement of pegs, and it was this arrangement of pegs which was used to depict asymmetry in this 
study. 

Figure 1. Symmetrical arrangement of pegs.


Figure 2. Asymmetrical arrangement of pegs.


It should be made clear that the way in which symmetry and asymmetry were defined in this study 
could be conceived as being conventional, which obviously may not be consistent with other ways of 
defining symmetry and asymmetry. 


Materials 


The materials used in this investigation for determining the relation between perceptual 
comprehension of numerals and symmetry-asymmetry and other related hypotheses were constructed 
in such a way that they appeared to children as a kind of toy. In the writer's judgement, this kind of
toy was a novelty to most, if not all, of the children involved. The materials, consisting of five sets of
boards, three symmetrically designed as illustrated in Fig. 3, and two asymmetrically designed as
shown in Fig. 4, were made of paper-board with little pegs inserted in it. Figure 3 shows a
three-dimensional symmetrical set, and Fig. 4 an asymmetrical one.





Figure 4. Asymmetrical set. 


In order to reduce the number of variables impinging on the study, the colours of all five sets, 
including the pegs and the card boards, were kept constant; that is, it was deliberately designed that 
children would not associate the colours of the cards with those of the pegs. A sample is shown in 
Fig. 5. 

Since there were five numerals and five sets of materials involved in this study, five colours, namely
black, red, yellow, orange and blue (in serial order), were used to construct the cards, with holes
punched in them. The pegs in the board were designed and constructed in such a way that the holes
in the cards would fit the pegs on the board exactly.


Figure 5. Balanced, coloured cards and boards. Abbreviations: Bl = Black; R = Red; Y = Yellow;
O = Orange; B = Blue.


As mentioned earlier, the colour arrangement of the card was so designed that there was a balance 
of colours. For example, one set began with black for numeral 1, red for numeral 2, yellow for 
numeral 3, orange for numeral 4, and blue for numeral 5; and another set began with red for numeral 
1, yellow for numeral 2, orange for numeral 3, and so on, as shown in Fig. 5. 

There were three cards for each numeral in a set, and the colour of these three cards was also 
varied and balanced. 

The reason why there were five numerals used in the study (represented by the number of pegs and 
the number of holes in the cards), namely 1, 2, 3, 4 and 5, was that the experimenter wanted to make 
sure that the kindergarten children had no, or as little as possible, difficulty in comprehending 
perceptually numerals up to five. What he was interested in was the extent of perceptual
difficulty which might be encountered by these children when the pegs were asymmetrically arranged
instead of symmetrically arranged. Therefore the investigator wanted to examine whether or not, by
using the materials designed in the study, these kindergarten and first grade children would have
difficulty in comprehending the concept of a numeral if the objects (pegs used here) were
asymmetrically arranged instead of symmetrically arranged. 


Procedure 


Since the materials constructed for this study called for individualized testing, the investigator 
collected data with the assistance of three of his graduate students; however, standard instructions 
and uniform criteria for accepting responses were used in all cases. 

The materials for measuring the perceptual comprehension of symmetry and asymmetry were 
individually administered to all 73 children in an appropriate place in the school or instructional
premises. Children were first briefly shown the materials with cards on their respective pegs. Then the
cards were unloaded and mixed. Children were then asked to place the cards on the pegs.
For the purpose of having standardized scoring, it was made clear to the investigator's graduate 
students that only the first response in associating the number of holes on the card with the number 
of pegs was acceptable. What this meant was that a child's first association of a 3-hole card with 3 
pegs was credited, despite the fact that she/he might have difficulty in actually inserting the card into 
the pegs. In other words, if a child's first response was to associate a 3-hole card with 4 pegs, his/her 
trial in this instance was not given credit. 

It was also made clear to the experimenter's assistants that in order to qualify for the perceptual 
comprehension of a numeral, a child had to associate successfully, on the first response basis, all three 
cards (100 per cent correct) with that particular number of pegs.
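
The scoring criteria described above can be sketched as follows. This is an illustrative sketch only: the data layout, the function names and the award of one point per numeral per set are assumptions, since the text does not state how credits were totalled.

```python
def numeral_credited(first_responses, n_pegs):
    """True only if all three cards for a numeral were first associated
    with the correct number of pegs (100 per cent correct)."""
    return len(first_responses) == 3 and all(r == n_pegs for r in first_responses)

def set_score(responses_by_numeral):
    """Number of numerals credited in one set.

    responses_by_numeral maps each numeral (1-5) to the peg counts the child
    first chose for its three cards. One point per numeral is an assumption."""
    return sum(numeral_credited(resp, numeral)
               for numeral, resp in responses_by_numeral.items())

# Hypothetical child: first response for one 3-hole card was the 4-peg position.
child = {1: [1, 1, 1], 2: [2, 2, 2], 3: [3, 4, 3], 4: [4, 4, 4], 5: [5, 5, 5]}
score = set_score(child)  # numeral 3 is not credited, so the set score is 4
```

Only first responses enter the record, so later self-corrections and any difficulty in physically fitting a card do not affect the score.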

Generally, children were cooperative because they liked this kind of game. For the kindergarteners, 
it took them about 15 min, on average, to complete the test. For the first graders, the average time
was roughly 11 min. 

The test administrators reported that the numerals which were often not successfully associated by 
the children on their first trial were 3, 4, and 5; children had no difficulty in associating the number 
of holes in cards with the appropriate number of pegs for numerals 1 and 2. 

In the meantime, information on these children's parents’ occupations was obtained from the 
school authorities. Warner & Lunt's (1942) guidelines for categorizing these occupations were 
followed. 

Since there were three symmetrical and two asymmetrical sets, the scores on symmetry and
asymmetry were adjusted in order to be comparable. Therefore, the score range for symmetry, as well
as that for asymmetry, was from zero to 30 points.
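
The text does not say how the adjustment was made. One simple possibility, sketched below purely as an assumption, is a linear rescaling of the raw number of credited numerals (a maximum of 15 over three symmetrical sets, and of 10 over two asymmetrical sets) onto the common range of zero to 30 points. The function name adjusted_score is hypothetical.

```python
N_NUMERALS = 5      # numerals 1 to 5 in every set
MAX_ADJUSTED = 30   # common score range reported in the text

def adjusted_score(raw_score, n_sets):
    """Linearly rescale a raw score (credited numerals summed over n_sets sets)
    onto the 0-30 range. The linear rule itself is an assumption."""
    max_raw = n_sets * N_NUMERALS
    if not 0 <= raw_score <= max_raw:
        raise ValueError("raw score outside the possible range")
    return raw_score * MAX_ADJUSTED / max_raw

symmetry = adjusted_score(12, n_sets=3)   # 12 of 15 credited
asymmetry = adjusted_score(8, n_sets=2)   # 8 of 10 credited
```

Under this rule each credited numeral is worth 2 points in the symmetrical sets and 3 points in the asymmetrical sets, so equal proportions correct give equal adjusted scores.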


Results 


Data were analysed in accordance with the hypotheses of the study, and the results are 
presented below. 

(1) For hypothesis no. 1, a t test (Downie & Starry, 1977) was applied, first by
using kindergarteners and first graders combined; the result, t = 0.51, d.f. = 71, was
not statistically significant at the 0.05 level. The kindergarten and first grade children were
then separated for further t tests; the results, t = 0.77, d.f. = 44, and t = 0.50, d.f. = 27,
respectively, were not statistically significant at the 0.05 level. Therefore, this hypothesis was
rejected.

(2) The difference between kindergarteners and first graders on symmetry scores, with a t
value of 5.22, d.f. = 71 (Guilford & Fruchter, 1978), was statistically significant beyond
the 0.001 level (the mean for first graders = 28.85, the mean for kindergarteners = 23.02).
What this finding meant was that first graders in this study performed significantly better
than kindergarteners on their perceptual comprehension of symmetrically arranged
numerals.

A t test was also applied to the difference between kindergarteners and first graders on
asymmetry scores. A t value of 4.63, d.f. = 71, obtained was statistically significant beyond
the 0.001 level (the mean for first graders = 27.75, the mean for kindergarteners = 21.53). This
finding meant that the first graders performed significantly better than kindergarteners on
their perceptual comprehension of asymmetrically arranged numerals.


(3) The differences between boys and girls, with kindergarten and first grade combined, on
symmetry scores (the mean for boys = 25.00, the mean for girls = 25.60) as well as on
asymmetry scores (the mean for boys = 24.20, the mean for girls = 23.60) were not
statistically significant at the 0.05 level. Therefore, the null hypothesis was accepted.

(4) The differences between children from the middle class and those from the lower class
(with kindergarteners and first graders combined) on symmetry scores were found to be
statistically significant at the 0.01 level, with a t value of 3.92, d.f. = 71 (the mean for
middle-class children = 28.20, the mean for lower-class children = 23.50). What this finding
meant was that middle-class children did significantly better than lower-class children on
their perceptual comprehension of symmetrically arranged numerals.

Again with kindergarteners and first graders combined, a t value of 2.19, d.f. = 71, was
found for the difference between middle-class children (the mean = 26.00) and lower-class
children (the mean = 22.70), statistically significant at the 0.01 level. This finding meant that
middle-class children did significantly better than lower-class children on their perceptual
comprehension of asymmetrically arranged numerals.
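
For readers who wish to relate the reported t values to probability levels, the sketch below converts a t value and its degrees of freedom into a two-tailed probability. It is not the procedure used in the study: the numerical integration (Simpson's rule over the t density) and the function names are choices made here for illustration, and a two-tailed test is assumed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed probability, integrating the density from 0 to |t|
    by Simpson's rule (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)

# t values reported above, all with d.f. = 71
reported = {"grade, symmetry": 5.22, "grade, asymmetry": 4.63,
            "class, symmetry": 3.92, "class, asymmetry": 2.19}
probabilities = {label: two_tailed_p(t, 71) for label, t in reported.items()}
```

On these assumptions the value for t = 2.19 falls between 0.01 and 0.05, which readers may wish to compare with the level reported in the text.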


Discussion 

The investigator was cognizant of the fact that the sample of the study was relatively small, 
and that it would strengthen the external validity of findings if the sample size were 
increased. 

It would also augment the internal validity of the study if one test administrator were 
used in order to reduce the possible effect due to a number of test administrators being 
involved.

Perhaps it would be appropriate to eliminate numerals 1 and 2 in both symmetrical and
asymmetrical sets, because they appeared too easy for kindergarteners as well as for first 
graders. Instead it might be feasible to extend the numerals to 6 and 7 if the kindergarteners 
(5 years old) have no difficulty in comprehending them. 

As revealed in the statistically significant findings, the first graders did significantly better 
than the kindergarteners in both the symmetrically and asymmetrically arranged numerals. 
Such a finding can only be regarded as validating our belief that first graders are more 
mature mentally, in this instance, than kindergarteners. 

The second significant finding was that middle-class children performed significantly 
better than lower-class children in both the symmetrically and asymmetrically arranged 
numerals. For this kind of finding, perhaps one might say that middle-class homes are 
richer, intellectually more stimulating in terms of human and material resources than 
lower-class homes. Hence, this finding reinforces the validity of one of our commonly held 
beliefs. 

Witkin et al. (1962) reported that field dependent persons tended to experience the 
environment more globally and to conform to the influence of the surroundings, whereas 
field independent individuals perceived their surroundings more analytically. More 
explicitly, this meant that the field independent persons appeared to inhibit their 
responding until all available alternatives were evaluated. Davis & Haueisen (1976) found 
that field independent persons were more efficient learners than field dependent persons, 
meaning that they were better at hypothesis testing and problem-solving. Stuart (1967) 
compared pupils reading on grade level with those reading below grade level and obtained 
a significant positive interaction between field independence and reading achievement. It 
was suggested that difficulties in reading might be related to perceptual style which in turn 
could be a manifestation of cognitive style. Gluck (1972) reported a significant relationship 
between understanding the meaning of a paragraph and field independence. In the light 
of these findings on cognitive style, the perceptual comprehension of symmetry and 
asymmetry could have implications for the teaching of reading. Is the perceptual 
comprehension of symmetry and asymmetry related to field independence and dependence? Is 
such perceptual comprehension also related to the reading process? These questions should 
be examined in subsequent studies. 


Elkind and his associates (Elkind & Scott, 1962; Elkind et al., 1964) contended that the 
processes involved in perceptual growth are the basic development of decentration. That 
is, when children were shown a series of drawings at different age levels, the nursery school 
children saw only parts while kindergarten and first grade children saw primarily wholes, 
and children at the second grade level and beyond saw the parts and wholes in an 
integrated fashion. How is Elkind’s decentration process related to the perceptual 
comprehension of symmetry and asymmetry? Is that comprehension process as 
developmental as the decentration process? What implication is there for the teaching of 
art? Elkind et al. (1965) also found 
that while tactile discrimination of sandpaper letters was positively related to reading 
achievement among beginning readers, it was negatively correlated with reading 
achievement among advanced readers. In other words, it is customary for the beginning 
reader to use his finger as a marker for printed matter. But once he becomes an advanced 
reader, using a finger as a marker would impede his reading. Again this finding could mean 
that the reading process might be related to the perceptual comprehension process of 
symmetry and asymmetry. 


In a rather comprehensive review of the literature on sex differences, Maccoby & Jacklin 
(1974) reported that the verbal abilities of boys and girls are quite similar until early 
adolescence. Even though their review did not mention perceptual comprehension, the 
finding of this study tends to support their generalization on this aspect. 

It is apparent that the study did not include the use of the Frostig Development Test of 
Visual Perception (Frostig et al., 1964) as a measure to discern children’s level of 
visual-perception development. However, Coles (1978), after an excellent review of the 
literature, reported that the validity of the Frostig test and other tests for detecting learning 
disabilities is rather limited, and he contended further that the problems of learning 
disabilities require social solutions rather than biological explanations. 
Heart-rate change as a measure of verbal storage and retrieval 


Donald P. Spence and David R. Beyda 





Heart rate and respiration were monitored while the subject listened for the appearance of a target 
sentence in a subsequent block of text. In two-thirds of the trials, the target sentence was altered by 
major or minor changes in wording, and the subject was asked to indicate whether or not a change 
had occurred. Heart rate preceding the first appearance of correctly recognized target sentences 
showed a steeper acceleration than heart rate preceding incorrectly recognized sentences. Mean heart 
rate during the correctly recognized embedded sentences showed a steeper deceleration than was 
present when the wording change was not recognized. Respiration amplitude decreased significantly 
in the vicinity of the embedded sentence (perhaps reflecting some kind of ‘Ah-ha’ effect) but did not 
discriminate between hits and misses, and did not correlate with change of heart rate. It was 
concluded that falling heart rate may be triggered by a shift in style of listening set in motion when 
the subject attends more closely to the specific words in the sentence as opposed to its underlying 
meaning. 





Language is transparent — by which we mean that we ‘listen through’ the words in an 
utterance in order to understand its underlying meaning. As a number of studies have 
made clear (see Jenkins, 1974, for a review of recent experiments) our recognition of the 
specific words in an utterance is frequently no better than chance. So long as a test 
sentence agrees with a target sentence with respect to meaning, it tends to be accepted as 
the target sentence, even when quite different words are employed (Bransford & Franks, 
1971). In a related experiment by Sachs (1967), subjects listened to 28 taped passages and 
heard after each one either a test sentence which was identical to a sentence from the 
original passage or one which resembled it closely. Thus the target sentence might be ‘He 
sent a letter about it to Galileo, the great Italian scientist’ and one of the following 
sentences would be presented as a test: the target sentence; a formal transform: ‘He sent 
Galileo, the great Italian scientist, a letter about it’; a semantic transform: ‘Galileo, the 
great Italian scientist, sent him a letter about it’; or a passive transform: ‘A letter about it 
was sent to Galileo, the great Italian scientist’. Subjects were instructed to study the 
wording carefully and mark ‘changed’ if there was any change at all in the test sentence. 
Recognition was tested immediately after the passage, or after 80 or 160 syllables of 
interpolated text (approximately 27 and 46 seconds later, respectively). 

Sachs found that the recognition of semantic changes remained high (between 70 and 90 
per cent correct) during all three conditions, but that the recognition of passive, formal and 
identical wordings fell off rapidly. Sachs concluded that ‘Even when the meaning of a 
sentence was remembered, formal properties that were not necessary for that meaning were 
forgotten very quickly. The results suggest that the original form of the sentence is stored 
only for the short time necessary for comprehension to occur... Thus the memory of the 
meaning is not dependent on memory of the original form of the sentence’ (Sachs, 1967, 
p. 437). 

These findings bring to mind an earlier observation by Polanyi. In describing his 
experience in reading his correspondence in a foreign language, he writes as follows: ‘I am 
vividly aware of the meaning conveyed by the letter, yet know nothing whatever of its 
words. I have attended to them closely but only for what they mean and not for what they 
are as objects. If my understanding of the text were halting, or its expressions or its 
spelling were faulty, its words would arrest my attention. They would become slightly 
opaque and prevent my thought from passing through them unhindered to the things they 
signify’ (Polanyi, 1964, p. 57). 

Although these and other observations agree that the original surface structure is 
generally preserved only long enough to provide us with an underlying meaning, there may 
be conditions under which more specific retention may occur. At least two time periods are 
significant. Close attention to the first presentation of the target sentence is necessary for 
initial storage of its meaning, and additional attention might result in more lasting storage 
of the individual words. An independent measure of attention at this time would allow us 
to understand variations in later performance. Similarly, variations in attention to the 
alternative wording, following the block of text, may well influence the subject’s ability to 
make a correct match. 

In an earlier study (Spence et al., 1974), we showed that if a subject is alerted to specific 
themes in a continuous text, his heart rate (HR) tends to decelerate at the times when 
these themes appear. These findings extend the original studies of the Laceys (1963, 1978) 
which showed that HR deceleration is triggered by attention to an external stimulus. In our 
earlier experiment, subjects were asked to attend to direct and indirect references to the 
general theme of termination of psychotherapy as they were listening to a 17 min passage 
spoken by the patient. Heart rate dropped in the vicinity of the termination clues and rose 
in the vicinity of the unrelated control clues. In the same experiment, we also discovered 
that the amount of deceleration coinciding with a termination clue could be used to predict 
its later recall, suggesting that the steeper the deceleration curve, the greater the attention 
being devoted to processing and storing the textual information. But these findings could 
not discriminate between storage of meaning and storage of precise wording. 

If we apply these findings to a sentence recognition situation, we might discover that the 
amount of attention paid to the target sentence can be gauged from the shape of the HR 
wave form which appears when the target sentence is presented. Sentences which coincide 
with certain wave forms may be better remembered than sentences which show other wave 
forms. The general finding that the specific wording of the sentence is forgotten very 
quickly may be somewhat overgeneralized, and we may discover that the amount of precise 
recall will depend on how much attention is invested in listening to the original target 
sentence. In similar fashion, the amount of attention invested in the transformed sentence, 
as measured by the steepness of the deceleration curve, might determine the accuracy of the 
recognition response. 

To this end, we modified Sachs’ procedure, presented the target sentence first, and then 
asked subjects to listen to a subsequent paragraph containing either the original sentence 
or some transformation of it. We monitored HR at the time when the target sentence was 
first presented, and later when the embedded sentence (the original target sentence or its 
transform) occurred in the block of text. We predicted that the shape of the HR wave form 
at both points in time would tell us something about the likelihood of a successful match. 
The wave form during the initial target sentence might be an index of the extent to which it 
was registered in long-term memory. The second wave form might be an index of the 
amount of attention being used to process the embedded sentence, extract its original 
wording and compare it with the wording of the target sentence. 


Procedure 


Twenty-six male NYU undergraduates were monitored for respiration and EKG while they listened 
to a series of 12 paragraphs. Respiration was measured by a mercury strain gauge around the chest 
connected to a Beckman eight-channel polygraph; from the recording we measured amplitude and 
frequency of respiration in the vicinity of the embedded sentence. Leads for heart rate (right wrist and 
left ankle — right ankle was ground) were connected to an Electronic Industries data-logging system 
which made a digital recording of beat-by-beat HR continuously throughout the experiment. The 12 
paragraphs were tape-recorded on one channel of a stereo tape; inaudible tones on the second 
channel marked key segments of the stimulus, and were recorded on the digital tape. In this way, 
specific HR could be precisely correlated with specific portions of the stimulus tape. The same set of 
tones was used to mark the polygraph record, allowing us to relate respiration to HR and to the 
verbal stimulus. 

After the electrodes had been attached, the subject was told that he would hear a sentence, 
followed by a brief pause, and then a short text. He was to listen carefully to the text in order to 
determine whether the target sentence appeared in the text exactly as it had when first presented. At 
the end of the text, the subject was asked to push a ‘Yes’ button (on an adjoining signal box) if the 
embedded sentence was the same as the target sentence, and a ‘No’ button if different (the two 
buttons made distinctive marks on both the digital tape and the polygraph). Button press was 
delayed until the end of the text in order to eliminate intervening muscle artifacts. The subject was 
told to expect occasional repetitions of the target sentences within a text in order to maintain attention 
throughout the text; should he hear the sentence twice, he was to push the signal button twice. (In 
actuality, only one catch trial was given; it followed the sixth text block, and was not counted in the 
scoring.) 

The target sentence appeared in the subsequent text in three ways — unchanged (trials 2, 4, 8 and 
10); changed in wording but only marginally changed in meaning (trials 1, 6, 9 and 11); and grossly 
changed in meaning (trials 3, 5, 7 and 12). (Thus if the target sentence was ‘David looks up again to 
white mountains across the valley’, a marginal transform would be ‘The white mountains across the 
valley catch David’s attention’. If the target sentence was ‘The companies in return will not pay a 
bonus this year’ a major transform would be ‘The companies, in return, will pay every worker a $150 
bonus’.) To be correct, the subject would say ‘Yes’ to the first set (no change in wording) and ‘No’ 
to the second and third sets. 

After the instructions had been understood, the subject was given two practice trials, followed by a 
5 min rest period. An 11 s sample of HR was recorded midway through the rest period. We then 
presented (by tape recorder) the 12 sets of target sentences and text blocks. In each set, the target 
sentence was first announced and then read; after a 2 s pause, the text block was read, followed by a 
30 s silence for the subject’s response. The texts were drawn from a variety of magazines and books; 
they included both fiction and non-fiction. Each selection lasted from 3 to 4 min. The embedded 
sentence (mean length: 11.5 words, lasting approximately 5 s) appeared from 1.6 to 2.5 min after the 
start of the text, marked by a tone on the inaudible track of the stimulus tape. Two other inaudible 
tones appeared, the first about 1 min from the beginning of the text (called control before) and the 
second about 1 min after the embedded sentence (control after). Control tones always appeared at 
sentence boundaries. Heart rates were computed for the five seconds before and after each marker 
tone. Position of the target sentence was determined after the data were collected (a marker tone was 
inadvertently omitted); heart rates were computed for the five seconds just before the target sentence 
was announced, and for five seconds after it began. 


Results 

Heart rate 

Group findings. The 11 s profiles for rest, target sentence, control before, embedded sentence 
and control after were averaged across all trials for each subject and analysed by a 
repeated-measures analysis of variance (26 subjects × 5 conditions × 11 s). Mean profiles for 
each condition are shown in Fig. 1. The main effect for condition is significant (F = 3.85, 
d.f. = 4, 100, P < 0.01), indicating that mean HR varies according to the task; the main 
effect for time is significant (F = 2.85, d.f. = 10, 250, P < 0.005), indicating that 
second-by-second HR varies across the 11 s ‘window’; and there is a significant interaction of time by 
condition (F = 3.95, d.f. = 40, 1000, P < 0.001), indicating that the shapes of the HR wave 
forms differ across the five conditions. 
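
The degrees of freedom quoted for this analysis can be checked against the stated design. The short sketch below assumes a fully within-subject layout (every subject contributes all 5 conditions and all 11 seconds) and that each effect is tested against its own subject-by-effect error term; the function name is ours, not the authors'.

```python
def rm_anova_df(n_subjects, n_conditions, n_times):
    """Degrees of freedom for a two-factor, fully within-subject design.

    Assumes each effect is tested against its interaction with subjects.
    Returns (numerator d.f., denominator d.f.) for each effect.
    """
    s = n_subjects - 1
    c = n_conditions - 1
    t = n_times - 1
    return {
        "condition": (c, c * s),
        "time": (t, t * s),
        "condition_x_time": (c * t, c * t * s),
    }


# Design reported in the text: 26 subjects x 5 conditions x 11 s.
for effect, (df_num, df_den) in rm_anova_df(26, 5, 11).items():
    print(f"{effect}: d.f. = {df_num}, {df_den}")
```

Under these assumptions the sketch gives 4 and 100 for condition, 10 and 250 for time, and 40 and 1000 for the interaction, which agree with the values reported in the text.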

We next looked more closely at the wave forms for target sentence, control before and 
embedded sentence, comparing them two at a time in all three possible combinations. Each 
pairing shows a significant interaction of condition by time, with the largest F appearing in 
the comparison of target sentence and embedded sentence (F = 7.01, d.f. = 10, 250, 
P < 0.001). We next looked at the shape of the wave form for each condition. The wave 
form for target sentence shows a significant quadratic component (F = 8.59, d.f. = 1, 250, 
P < 0.005); the wave form for embedded sentence shows significant linear, quadratic and 
cubic components (Fs = 86.67, 9.72 and 7.22; P < 0.001, 0.005 and 0.01); and the wave 
form for control before is not significant on any dimension. The significant cubic wave form 
for embedded sentence (sharp deceleration) is similar to that found by Spence et al. (1974) 
in the earlier study of recognition of transformed themes in a continuous verbal text, and to 
that reported by Roessler et al. (1969) in a study of the recognition of random tones. 

Figure 1. Mean heart rate for 26 subjects over five conditions. In this and other figures, the heart rate 
is measured in beats per minute during each second of an 11 s period that is centered on the start of 
the sentence (with the one exception of rest when there was only silence). 

The profiles for target and embedded sentences were next divided into correct 
recognition (with the subject saying ‘Yes’ when the embedded sentence was the same as the 
target sentence and ‘No’ when it was different); and incorrect recognition. In general, 
correct recognition required the subject to detect a difference of only one word between the 
target sentence and its embedded transform; as a consequence, close attention was 
required. Overall, subjects were correct on 79 per cent of all trials and their accuracy was 
about evenly divided over the three classes of transformations (no change, slight meaning 
change, gross meaning change). 

We averaged each subject's wave form for the embedded sentences separately for correct 
and incorrect trials and compared each average with the average of the corresponding wave 
form for the target sentence. Mean scores for correct recognition over all conditions are 
shown in Fig. 2, with mean rate for control before added for purposes of comparison. An 
analysis of variance comparing target sentence with embedded sentence shows a clear 
difference between conditions (F = 14.29, d.f. = 1, 25, P < 0.001) and a significant 
interaction between condition and time (F = 4.07, d.f. = 10, 250, P < 0.001). 

Figure 3 presents the corresponding curves for incorrect trials. The difference between 
conditions is not significant, and there is no significant interaction of condition × time. 

Figure 2. Mean heart rate for correct recognition when the target sentence was first presented, during 
an intervening control sentence, and when the embedded sentence was correctly recognized. 

Figure 3. Mean heart rate for incorrect recognition. 


Differential sensitivity. Each subject’s record of correct and incorrect matches was 
summarized by A’, a statistic with many of the same properties as d’ but less affected by 
the underlying distribution (see Grier, 1971). Like d', it takes account of hits and misses 
while correcting for response bias. Subjects were divided into 13 highly sensitive (high A’) 
and 13 less sensitive subjects (low A’). We then compared the two groups on HR during 
target sentence, control before and embedded sentence.* 
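
For readers unfamiliar with A’, the sketch below computes it from a hit rate and a false-alarm rate using the formula given by Grier (1971). How responses were scored as hits and false alarms in this experiment is not spelled out in the text, so the example treats a ‘No’ to a changed sentence as a hit and a ‘No’ to an unchanged sentence as a false alarm; that mapping, the trial counts and the function name are assumptions for illustration only. The below-chance branch is a commonly used mirror-image extension and is not part of Grier's original formula.

```python
def a_prime(hit_rate, fa_rate):
    """Non-parametric sensitivity index A' (Grier, 1971).

    hit_rate and fa_rate are proportions between 0 and 1.
    Returns 0.5 at chance and 1.0 for perfect discrimination.
    """
    if not (0.0 <= hit_rate <= 1.0 and 0.0 <= fa_rate <= 1.0):
        raise ValueError("rates must lie between 0 and 1")
    if hit_rate == fa_rate:
        return 0.5
    if hit_rate > fa_rate:
        h, f = hit_rate, fa_rate
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance: mirror-image form (an extension, not in Grier).
    h, f = hit_rate, fa_rate
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


# Hypothetical subject: 7 of 8 changed sentences detected,
# 1 of 4 unchanged sentences wrongly called changed.
print(round(a_prime(7 / 8, 1 / 4), 3))
```

Ranking subjects on this index and splitting the ranked list in half would then give high and low A’ groups of the kind described above.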

In the first analysis (target sentence × control before) there is a significant difference 
between conditions (HR is higher during target sentence, F = 6.12, d.f. = 1, 24, P < 0.01), 
a significant interaction of condition × time (F = 3.22, d.f. = 10, 240, P < 0.001) and a 
significant triple interaction of group × conditions × time (F = 1.99, d.f. = 10, 240, 
P < 0.05). The last finding reflects the fact that the more sensitive subjects show a more 
pronounced acceleration as they are anticipating the target sentence, and a more 
pronounced deceleration when it is presented than do the less sensitive subjects. 

* We are indebted to Dr Hollis Scarborough for suggesting this statistic and for carrying out the necessary 
computations. 

Figure 4. Mean heart rate for good discriminating subjects (High A’, n = 13) and poor discriminating 
subjects (Low A’, n = 13). 

In the second analysis comparing embedded sentence with control before (Fig. 4), there 
was a significant interaction of condition × time (F = 7.02, d.f. = 10, 240, P < 0.001) and a 
significant interaction of group × condition × time (F = 2.35, d.f. = 10, 240, P < 0.01), 
stemming from the fact that the more sensitive subjects (high A’) show a much more 
pronounced HR deceleration during the embedded sentence than do the less sensitive 
subjects (compare the two halves of Fig. 4). If we compare Figs 1 and 4, it is apparent that 
the sharp drop in HR during the embedded sentence is contributed primarily by the more 
sensitive subjects. Amount of deceleration during the embedded sentence from time t = 0 to 
t = 4 is correlated 0.50 (P < 0.01) with A’. 


Predictors of recognition. As shown in Fig. 2, HR for correct trials began to decelerate 
rapidly as soon as the target sentence or its transform was heard (look at the wave form for 
embedded sentence). This deceleration suggests some form of recognition similar to that 
found by Spence et al. (1974) when subjects identified relevant themes in the 17 min 
interview. The deceleration may also reflect a further step in processing whereby the specific 
wording of the embedded sentence is being matched against the stored image of the target 
sentence. This hypothesis would suggest that the extent of deceleration during the 
appearance of the embedded sentence (roughly indexed by the change in HR from time 
t = 0 to t = 4) might be used to predict overall success. We computed mean HR for all 
subjects for each of the 12 embedded sentences, and computed the change in HR from time 
t = 0 to t = 4 s for each average wave form (see Fig. 2). Similar change scores were 
computed for the target sentences, averaged over all subjects, for the period from time 
t = -4 to t = 0 (the time of steady increase shown in Fig. 2 for the target sentence wave 
form, which corresponded to the time when the experimenter was saying ‘The target 
sentence is...’); and for the period from t = 0 to t = 4 when the target sentence was first 
presented. Correlations were then determined between each set and success in recognizing 
the embedded sentence, using the percentage of correct matches as our criterion. 
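
The change scores described above amount to a simple difference between two points of an 11 s profile. The sketch below assumes that each profile holds one HR value per second from t = -5 to t = +5, so that t = 0 falls at the sixth sample; the profile values and the function name are hypothetical and are not data from the study.

```python
ONSET_INDEX = 5  # assumed layout: samples run from t = -5 to t = +5 seconds


def change_score(profile, start, end):
    """Return HR(end) - HR(start) in beats/min.

    `start` and `end` are times in seconds relative to sentence onset (t = 0).
    """
    if len(profile) != 11:
        raise ValueError("expected an 11-sample profile")
    if not (-5 <= start <= 5 and -5 <= end <= 5):
        raise ValueError("times must lie between -5 and +5 s")
    return profile[ONSET_INDEX + end] - profile[ONSET_INDEX + start]


# Hypothetical mean profile (beats/min) for one sentence.
profile = [72.0, 72.4, 72.9, 73.5, 74.1, 74.6, 73.8, 73.0, 72.3, 71.9, 71.7]

anticipatory_rise = change_score(profile, -4, 0)  # positive when HR accelerates
listening_drop = change_score(profile, 0, 4)      # negative when HR decelerates
print(round(anticipatory_rise, 2), round(listening_drop, 2))
```

One such pair of scores per sentence, set against the percentage of correct matches for that sentence, is the input to the correlations reported next.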

The best predictor of group recognition is the increase in HR just before the target 
sentence is introduced (time t = -4 to t = 0); it correlates 0.63 (P < 0.05) with overall 
success over the group of 12 sentences. The four sentences with the greatest number of 
correct matches show a mean increase of 2.55 beats/min; the next five show a mean 
increase of 1.68 beats/min; and the three with the lowest number of correct matches show a 
mean decrease of 0.90 beats/min. These data show that increase in HR during the 
preparatory period just prior to the registration of the target sentence predicts later success 
on the matching task, suggesting that HR acceleration may be an indication that the 
subject is about to pay maximal attention to the target sentence. A similar trend emerges 
when the same data are examined for individual subjects. For the time period t = -4 to 
t = 0 for each subject, the mean increase in HR for all sentences which he later matched 
correctly is greater than the mean increase for all sentences which he later matched 
incorrectly (P < 0.04; one-tailed sign test). 
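
A one-tailed sign test of this kind counts how many subjects show the predicted direction of difference and asks how likely that count, or a larger one, would be if each direction were equally probable. The sketch below is a minimal illustration; the counts are hypothetical (the text does not report how many subjects fell in each direction), and ties are assumed to have been dropped before counting.

```python
from math import comb


def sign_test_one_tailed(n_predicted, n_opposite):
    """Exact one-tailed sign test.

    Returns the probability of observing at least `n_predicted` differences
    in the predicted direction out of n_predicted + n_opposite untied pairs,
    under the null hypothesis that either direction has probability 0.5.
    """
    n = n_predicted + n_opposite
    favourable = sum(comb(n, k) for k in range(n_predicted, n + 1))
    return favourable / 2 ** n


# Hypothetical outcome: 10 subjects in the predicted direction, 2 against.
print(sign_test_one_tailed(10, 2))
```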

The second best predictor of group success is the drop in HR when the sentence is 
presented again (t = 0 to t = 4 for the embedded sentence). When we averaged HR over all 
subjects for each sentence and correlated change scores with matching performance, 
decrease in HR correlated 0.52 (P < 0.05) with overall success. The four sentences which 
showed the best overall recognition by the group showed a mean decrease of 3.2 beats/min; 
the next five show a mean decrease of 2.27 beats/min; and the three with the worst overall 
recognition show a mean decrease of 0.80 beats/min. When we look at scores for individual 
subjects, the same relationship appears: average HR decrease during the period t = 0 to 
t = 4 is significantly greater for correct than for incorrect matches (P < 0.02; one-tailed 
sign test). 

Drop in HR during the time the target sentence is first presented (t = 0 to t = 4) does 
not correlate with group success (rho = 0.21). However, the data for individual subjects 
show that the drop in HR for correct matches tends to be greater than for incorrect 
matches (P = 0.04, one-tailed sign test). 

If we combine the two best predictors of group success (HR increase during the 
preparatory period and HR decrease as the subject listens to the embedded sentence) by 
adding the two sets of ranks and reranking the sums, we have a Spearman correlation of 
0.79 (P < 0.01) with the criterion. In other words, if we know how much the subject's HR 
increased during the preparatory period and how much it decreased when he was presented 
with the embedded sentence, we have accounted for more than half the variance and we 
can make a fair estimate of his success at detecting lexical change. 
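
The combined predictor can be sketched as follows: rank the 12 sentences on each change score, add the two ranks, rerank the sums, and correlate the result with the percentage of correct matches. All values below are hypothetical placeholders rather than data from the study, the function names are ours, and tied values are given their mean rank (the text does not say how ties were handled).

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation computed as Pearson on mean ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_predictor(pre_increase, embedded_decrease):
    """Add the two sets of ranks and rerank the sums."""
    summed = [a + b for a, b in zip(average_ranks(pre_increase),
                                    average_ranks(embedded_decrease))]
    return average_ranks(summed)


# Hypothetical values for 12 sentences (beats/min and per cent correct).
pre_increase = [2.8, 2.5, 2.4, 2.1, 1.9, 1.8, 1.7, 1.5, 1.2, -0.5, -0.9, -1.3]
embedded_decrease = [3.0, 3.6, 2.9, 2.2, 2.6, 2.0, 2.4, 2.1, 0.7, 1.0, 0.9, 0.5]
percent_correct = [96, 92, 88, 92, 85, 81, 77, 81, 73, 65, 69, 58]

combined = combined_predictor(pre_increase, embedded_decrease)
print(round(spearman(combined, percent_correct), 2))

# Proportion of variance accounted for by the reported correlation of 0.79.
print(round(0.79 ** 2, 2))  # 0.62
```

Squaring the reported coefficient of 0.79 gives about 0.62, which is the sense in which the two change scores together account for more than half the variance.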


Respiration 


What triggers the change in HR that accompanies close attention? Because of the close 
connection between HR and breathing pattern, it seemed logical to assume that a subtle 
change in either frequency or amplitude of respiration, initiated by recognition of the 
embedded sentence, might precipitate a rapid slowing of HR. We measured change in both 
amplitude and frequency of respiration before and after control before, and before and after 
the embedded sentence. Amplitude of respiration in the 5 s period following the start of the 
correctly recognized embedded sentence was significantly lower than amplitude in a similar 
period before the sentence began (t = 2.96, P < 0.005). Amplitude of respiration was also 
significantly lower following incorrectly recognized target sentences than amplitude before 
the sentence began (t = 2.80, P < 0.01). Change in amplitude following hits was not 
significantly greater than change following misses. Amplitude showed no change before and 
after the first control period. Frequency of respiration did not change significantly during 
either correct or incorrect recognition and did not change during the control period. 

To compare HR and respiration, we computed the difference between mean HR in the 
5 s period before and after the start of the embedded sentence, and correlated these scores 
with similar differences in respiration amplitude. Correlation for correct recognition is 
positive but not significant (r = 0.23); correlation for incorrect recognition is 0.09. Thus the 
change in respiration triggered by the embedded sentence cannot be used to explain the 
deceleration in HR. 


Discussion 


In 79 per cent of all trials, our subjects were able to discriminate correctly between the 
target sentence and its embedded transform, despite the fact that the differences were subtle 
and that more than 1 min had elapsed between the presentation of the two forms to be 
compared. Success on the matching task is apparently related to what happens during two 
time periods — registration of the initial target sentence and processing of the embedded 
sentence — and we now look more closely at each of these intervals. 


Initial registration. Registration of the initial target sentence can be divided into two 
phases — the preparatory period, when the subject is told that a sentence is coming, and the 
appearance of the sentence. Acceleration of HR during the preparatory period is our best 
predictor of subsequent success on the matching task; deceleration during the appearance 
of the sentence is a marginally good predictor of individual success but a poor predictor of 
group results. The biphasic nature of the overall registration wave form can be clearly seen 
in Fig. 2. A comparison of Figs 2 and 3 suggests that a biphasic wave form is a prerequisite 
for matching success. 

We can best understand the initial acceleration in terms of activation theory. This theory 
would suggest that HR increase can be used as a measure of task demand (see Bergum, 
1966; Dahl & Spence, 1970). We might assume that changes in acceleration would reflect 
changes in the subject's anticipation of the coming target sentence, that is, in his 
readiness to process the information about to be presented. Acceleration during the 
preparatory period in this experiment is reminiscent of acceleration during the initial phase 
of a cognitive task in a study by Kahneman et al. (1969). In their experiment, subjects were 
instructed to add either 0, 1 or 3 to each of four serially presented digits. HR steadily 
accelerated between the time when the adding number was presented and the time when 
the digits were heard, once again suggesting that HR increase may reflect readiness to 
process information. What is significant about our findings is the fact that the degree of 
acceleration can be used to predict success in the later matching task. 

Further insight into this correlation comes from an analysis of the different types of 
sentences. Although HR increase during the preparatory period is a significant predictor of 
general success at sentence matching, it is particularly good as a predictor of matches 
where the wording has changed but the meaning has remained roughly the same (trials 1, 
6, 9 and 11); the correlation between HR increase and matching performance on this group 
of sentences is 0.80. HR increase is somewhat worse at predicting success on the no-change 
sentences (rho = 0.35) and makes no contribution to predicting success on sentences where 
the meaning changes as well as the words (rho = 0.00). These differences correspond to 
differences in task demand. To detect a change of wording without a change in meaning 
demands very close attention; to detect a change in meaning demands relatively little 
attention. Accurate registration of the target sentence is critical in the first case and not in 
the second. These data suggest that the degree of acceleration during the preparatory 
period may reflect the extent to which the subject is prepared to process highly complex 
information. When highly aroused, he may be prepared to store the individual words in the 
target sentence and therefore be in a better position to detect slight changes on the lexical 
level. But arousal may be less relevant to the detection of change in meaning — hence HR 
acceleration does not correlate with recognition of meaning-change sentences. 

Inspection of Fig. 3 suggests that some kind of anticipatory readiness, indexed by HR 
acceleration, is a prerequisite for accurate registration of the target sentence. The analysis 
of variance for incorrect matches shows that the wave form for target sentence registration 
is not significantly different from the wave form for the random sentence measured in 
control before; hence we can conclude that the subject was paying only casual attention to 
those target sentences he later failed to recognize. 

We next come to the appearance of the target sentence. Deceleration during this phase 
(t = 0 to t = 4) is a marginally good predictor of individual recognition but a poor 
predictor of group results. The low-to-moderate correlations may reflect the fact that HR 
decrease during this period is partly a recovery from the peak reached at t = 0 (see Fig. 2). 
Thus the extent of the drop may be determined more by the size of the anticipatory 
increase than by the specific content of the target sentence (HR changes during the 
preparatory and appearance periods are moderately correlated: rho = 0.49, P < 0.10). 
Perhaps for this reason, HR deceleration during this period is less predictive of matching 
success. 

The extent to which HR drop during appearance predicts later success in the individual 
data can be interpreted as supporting the Laceys' hypothesis that HR decrease is triggered 
by attention to external stimulus. In one study (1978), they showed that reaction time is 
correlated with HR deceleration, and that the length of the RT is inversely related to the 
amount of deceleration. Schell & Catania (1975) have applied this hypothesis to sensory 
processing and have shown that HR deceleration is correlated with success at detecting a 
near-threshold light. In the present study, it could be argued that to the extent that 
deceleration to the target sentence is not merely a recovery to base-line, it reflects the 
subject's readiness to register the sentence. 

Further insight into the function of bradycardia in this situation can be gathered from an 
examination of the fate of different types of target sentences. Heart-rate decrease during this 
period best predicts recognition of sentences where the wording changes but the meaning 
does not (the same group of sentences best predicted by HR increase during the 
preparatory period). HR deceleration may reflect the accurate registration of individual 
words in the target sentence which prepares the subject to detect a later change in wording. 
HR deceleration does not predict recognition of sentences in which meaning changes as 
well as words, and it might be argued that detection of meaning change does not require 
the kind of careful attention reflected in HR deceleration. 


Embedded sentence. When the subject hears the embedded sentence, he probably knows 
immediately that he has encountered familiar material, but only a secondary, closer 
analysis of the individual words and their sequence will allow him to identify the transform 
as the same or different from the target sentence. The initial recognition response is 
apparently reflected in a change in respiration amplitude; it can be seen as a kind of 
‘Ah-ha’ reaction, triggered by the approach of familiar material. The subject has discovered 
the convergence between the ongoing text and the remembered target sentence and is 
preparing to make a more detailed analysis. This interpretation accounts for the fact that 
respiration amplitude decreased significantly before and after the embedded sentences but 
did not change before and after the control sentences. But respiration did not discriminate 
between correct and incorrect recognition, and this finding is in keeping with the fact that 
the sensed convergence between text and target sentence would not necessarily predict the 
subject’s matching performance. 

Systematic word-by-word processing is apparently reflected in a steady HR deceleration 
which reaches its minimum at approximately the end of the embedded sentence (mean 
length = 5 s). We suggest that while the subject is listening to the embedded sentence, he is 
attempting to compare its surface structure with the surface structure of the target sentence. 
The fact that HR decelerates during this period is another example of the ‘bradycardia of 
attention’ (see Lacey & Lacey, 1978). When the subject realizes that a target sentence may 
be approaching, he becomes unusually attentive to the word-by-word sequence because 
only then will he be able to detect subtle lexical changes. We hypothesize that it is this 
extra deployment of attention which results in the deceleration which begins about the time 
the embedded sentence appears and ends about when it stops. Comparison of Figs 2 and 3 
shows that deceleration during this period is clearly a prerequisite for accurate recognition; 
when the HR does not decelerate (Fig. 3), the embedded sentence is misclassified. We have 
also seen that the extent of deceleration is correlated with success; this correlation would 
suggest that the more the HR is slowed during the period t = 0 to t = 4, the more attention 
is being deployed and the more carefully the subject is monitoring the embedded sentence. 
Deceleration, in other words, is not simply a recognition response, triggered by the arrival 
of familiar material; it is also an index of the degree of information-processing being carried 
out. 

It is worth pointing out that deceleration may index the deployment of attention inward 
as well. Once he extracts the salient features of the embedded sentence, the subject must 
compare it with his memory image of the target sentence and to perform this task, he must 
search his internal store. We have no way of knowing how much of the deceleration is a 
function of this internal search and how much is a function of external scanning. Future 
studies are needed in which we vary the delay between initial presentation of the target 
sentence and its later appearance in the text. Under these conditions, it would be possible 
to analyse deceleration as a function of elapsed time. Since internal scanning would 
presumably diminish over time, we would expect that the earlier wave forms would reflect 
both internal and external search, whereas the later wave forms would reflect mainly 
external search. 

Circadian rhythms in human memory 


Simon Folkard and Timothy H. Monk 





Two experiments are described that examined the influence of time of day of presentation on 
immediate and delayed retention, and the potential effects of time of day on retrieval from long-term 
memory. Time of presentation was found to influence both the immediate and delayed (28 day) 
retention of information presented in naturalistic contexts. However, while the trend in immediate 
memory over the normal waking day found in Expt 1 was exactly that predicted by a unidimensional 
arousal theory, the results of Expt 2 indicated that different circadian factors may be responsible for 
the time of presentation effects on immediate and delayed retention. Neither experiment yielded any 
evidence that time of day affects people’s ability to retrieve information from long-term memory. The 
results are discussed within a circadian rhythm framework, and would appear to necessitate the 
adoption of a multifactor theory. It is suggested that further research is needed on (a) the effect of 
time of presentation on delayed retention, and (b) the nature of the changes in the encoding/storage 
processes responsible for such effects. 





This paper is concerned with the effects of time of day of presentation on immediate* and 
delayed retention, the interpretation of such effects in terms of changes in arousal level, and 
the potential effects of time of day on retrieval from long-term memory. 

In an earlier study (Folkard et al., 1977), the time at which schoolchildren were read a 
story was found to differentially affect their immediate and delayed retention of the 
information presented in it. Immediate memory was superior following presentation at 
09.00, but after a delay of 7 days those children that had originally heard the story at 15.00 
remembered more than those that had heard it at 09.00. However, these delayed memory 
scores appeared to be unaffected by the time of day at which the delayed retention test 
itself was given. 

These effects of time of presentation on immediate and delayed memory were assumed to 
reflect the circadian rhythms that are known to exist in most physiological functions 
(Conroy & Mills, 1970). More specifically, it was suggested that they were mediated by 
changes in basal arousal level over the day. Colquhoun (1971) has argued that, with the 
exception of a post-lunch dip, arousal level increases over most of the waking day to reach 
a maximum at about 20.00, and to thus parallel the circadian rhythm in body temperature. 
This assumed variation in basal arousal level led Baddeley et al. (1970) to predict that 
afternoon or evening, as opposed to morning, presentation should result in superior 
delayed retention, since high arousal at presentation has consistently been found to benefit 
long-term memory (Craik & Blankstein, 1975). This prediction has subsequently received 
some support from the results of Hockey et al. (1972) and Folkard et al. (1977). In 
contrast, immediate memory has usually been found to be superior in the morning, and 
this has also been interpreted as reflecting changes in arousal level since some studies have 
found high arousal to impair immediate memory (Craik & Blankstein, 1975). However, this 
deleterious effect of arousal on immediate memory would appear to be rather less 
consistent than its beneficial effect on delayed retention. 

The differential effect of time of presentation on immediate and delayed retention 
observed by Folkard et al. (1977) thus appeared to be in line with the arousal theory. 


* Throughout this paper the term ‘immediate’ is used to refer to retention intervals of up to 15 min. This is the 
approximate delay at which the cross over between inferior immediate retention, and superior delayed retention, 
typically occurs in arousal and memory studies (see Craik & Blankstein, 1975). 

Figure 1. The trend in immediate memory over the day as measured by the digit span technique 
(Gates, 1916a, ●-●; Blake, 1967, ○-○) or by the ordered recall of auditorily presented digit 
sequences (Gates, 1916b, ●---●; Baddeley et al., 1970, ○---○). The scale used is the same for all 
four studies, although the absolute level differs. 

There are, however, three problems in interpreting the results in this manner. First, 
although basal arousal level has been assumed to start increasing from about 04.00, 
detailed examination of the published studies of time of day effects in immediate memory 
using the digit span/sequence technique suggests that such memory reaches a maximum 
rather later in the morning at about 10.00 or 11.00 (see Fig. 1). Thus although the finding 
of Folkard et al. (1977) that immediate memory is superior at 09.00 compared with 15.00 is consistent 
with a negative relationship between such memory and basal arousal level, the available 
digit span/sequence evidence suggests that immediate memory is approximately equal at 
these times. The most probable explanation of this apparent discrepancy is that different 
processes are almost certainly involved in digit span/sequence performance to those 
involved in remembering the information presented in prose. Thus, Laird (1925) obtained a 
rather different function, that is consistent with the results of Folkard et al. (1977), when 
examining college students’ immediate memory for the ‘ideas’ presented in a short passage 
of prose. However, this appears to be the only well-designed study that has examined such 
memory over a wide range of different times of day. One of the aims of the first experiment 
was thus to attempt to replicate the findings of Laird (1925). 

Secondly, the failure of Folkard et al. (1977) to find an effect of the time of day at which 
retrieval from long-term memory takes place is somewhat surprising since a number of 
studies (reviewed by M. W. Eysenck, 1977) have found differences in retrieval efficiency 
associated with personality and loud noise. In addition, there is evidence that retrieval may 
be more efficient when the subjects are in the same physiological state as when the original 
learning took place (e.g. Goodwin et al., 1969). However, in both cases there is some 
evidence that recall measures may be more sensitive to these effects than recognition ones. 
Thus the failure of Folkard et al. (1977) to obtain either a main effect of time of day on 
retrieval efficiency or a state-dependent effect, may have been due to the use of what was 
essentially a recognition (i.e. multiple choice) task. Expts 1 and 2 were thus designed to 
examine potential retrieval effects using recall measures. In addition, rather more extreme 
times of day were compared than those studied by Folkard et al. (1977). 

Thirdly, and perhaps most importantly, the differential effect of time of presentation on 
immediate and delayed retention observed by Folkard et al. (1977) is open to an alternative 
explanation. Thus the decrease in immediate memory over the day could have been due to 
a build-up of extra-experimental proactive interference. Further, the superior delayed 
retention following presentation later in the day could have been due to a reduction in the 
potential for retroactive interference to occur between presentation and the subsequent 
sleep period when memories may be consolidated (e.g. Jenkins & Dallenbach, 1924; Tilley 
& Empson, 1978). However, it is possible to distinguish between these arousal and 
interference interpretations when people switch to an inverted sleep/wake schedule (i.e. on 
night work). In such a situation, basal arousal level may be assumed to decrease over the 
waking period; while proactive interference should increase and the potential for retroactive 
interference decrease. Thus the arousal and interference interpretations of the effect of time 
of presentation on immediate and delayed retention result in opposite predictions in such a 
situation. The aim of the second experiment was to test these predictions by examining 
memory for the information presented in an in-service training film shown at the beginning, 
or towards the end, of a night shift. In addition, since there is evidence that people’s 
circadian rhythms adjust to night work (e.g. Colquhoun et al., 1968), 2-hourly temperature 
readings were collected in order to examine the influence of such adjustment on immediate 
and delayed (28 day) retention. 

The aims of the two experiments described in this paper were thus (i) to ‘map out’ the 
effect of time of day on the immediate memory for information presented in prose (Expt 1), 
(ii) to determine whether time of retrieval effects may occur when recall, rather than 
recognition, measures are used at relatively extreme times of day (Expts 1 and 2), and (iii) 
to distinguish between the arousal and interference interpretations of the differential effect 
of time of presentation on immediate and delayed retention (Expt 2). 


Experiment 1 
Method 


Subjects. A total of 36 undergraduates, 14 females and 22 males, with a mean age of 20.7 years 
(range = 18-27) took part in this study. Twenty of them were studying psychology but were unaware 
of the influence of time of day on performance on the type of tasks used. The remaining 16 were 
non-psychologists and were paid at a rate of 70p per hour for participating. 


Design and procedure 


A cyclic Latin square design was employed, identical to that used by Folkard (1975). Each subject 
was tested at 08.00, 11.00, 14.00, 17.00, 20.00 and 23.00. Subgroups of six subjects started the 
experiment at each of these times. For one group, namely that starting at 08.00, the study was 
completed within a single day. For the remaining five subgroups, the testing spread over 2 days, with 
the subjects taking a normal night's sleep between the 23.00 test on day 1 and the 08.00 test on day 2. 
The use of this design balances out practice effects over each of the times tested. 
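
The cyclic arrangement described above can be sketched as follows. This is a minimal illustration rather than the authors' own procedure: the function name and the subgroup numbering are hypothetical, and the sketch assumes only what the text states, namely six test times and six subgroups, each starting at a different time and then working through the remaining times in clock order.

```python
# Test times are taken from the text; everything else is illustrative.
TIMES = ["08.00", "11.00", "14.00", "17.00", "20.00", "23.00"]


def cyclic_latin_square(times):
    """Row g is the order of test times for the subgroup that starts at times[g]."""
    n = len(times)
    return [[times[(g + s) % n] for s in range(n)] for g in range(n)]


square = cyclic_latin_square(TIMES)
for g, row in enumerate(square, start=1):
    print(f"subgroup {g}: " + " -> ".join(row))

# Each time of day occurs exactly once at each serial position (session 1 to 6),
# which is what balances practice effects across the times tested.
for session in zip(*square):
    assert sorted(session) == sorted(TIMES)
```

Only the first row fits within a single day; every other row wraps from 23.00 to 08.00, which corresponds to the overnight break mentioned in the text.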

At each testing time, the subjects were given a different 1500-word article from the New Scientist to 
read. They were instructed to ‘read for comprehension’ but to read as much of the article as they 
reasonably could in the 3 minutes allowed. At the end of this 3-min period they indicated how far 
they had read, and were then given a 10-item, multiple choice questionnaire on the article’s contents. 
In compiling the questionnaires, care was taken to ensure that the 10 questions were answerable only 
on the basis of the information contained in successive blocks of 150 words. All the subjects were 
given the six articles used in the same order, thereby totally confounding potential differences 
between the articles with potential ‘practice’ effects, but balancing out these potential differences over 
the six times of day studied. The six articles were all abridged and edited to equate them in length, 
and to avoid the use of figures or diagrams. 

Having completed the multiple choice questionnaire on the passage read, the subjects were given a 
5 min category instance generation task similar to that employed by M. W. Eysenck (1974), and 
found by him to be sensitive to differences in arousal level associated with personality. Six pairs of 
categories were used, the pairs having been drawn from the Battig & Montague (1969) norms and 
chosen such that the number of common instances for each pair was approximately equal. Each pair 
of category names was printed at the top of a page with five columns of boxes beneath them. The 
subjects were instructed to write down as many instances from both categories as they could in the 
time allowed. They were allowed to switch between categories as they wished. At the end of each 
successive minute they were asked to start a new column, thus allowing separate scores to be 
obtained for each successive 1-min period. As in the case of the passages, all the subjects experienced 
the six pairs of categories in the same order, thus confounding practice with differences between pairs 
but balancing these out over time of day. Finally, the subjects’ oral temperatures were recorded using 
a standard clinical thermometer inserted sublingually for a timed, 3-min, period. 

The experiment was run in two stages. In the first, the 20 undergraduate psychologists formed pairs 
and tested one another as part of their laboratory practical course, while in the second, the 16 
non-psychology undergraduates were tested by the authors. Since the results from these two stages 
did not differ, they were combined. 


Results and analyses 


The results from the New Scientist articles were scored in terms of (a) the number of words 
read in the 3 min allowed (speed) and (b) the number of questions correctly answered 
(corrected for guessing) expressed as a percentage of the number of questions that could 
have been answered, given how much the subject had read (memory). The category 
instance task was scored simply in terms of the total number of instances generated in the 
5-min period, since preliminary inspection of the data indicated that the results were similar 
for the five successive 1 min periods. 
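
The memory score defined above can be illustrated with a short sketch. Several details are assumptions rather than facts from the text: the number of alternatives per question (taken here as four), the conventional correction R - W/(k - 1), and the rule that a question counts as answerable only once its whole 150-word block has been read. The function name and the example subject are hypothetical.

```python
def memory_score(words_read, n_correct, n_wrong, block_size=150, n_alternatives=4):
    """Percentage correct, corrected for guessing, over the answerable questions.

    Assumptions (not stated in the text): a question is answerable once its
    whole 150-word block has been read, the correction is R - W/(k - 1),
    and there are k = 4 alternatives per question.
    """
    answerable = min(10, words_read // block_size)
    if answerable == 0:
        return None  # the subject did not finish the first block
    corrected = n_correct - n_wrong / (n_alternatives - 1)
    return 100.0 * corrected / answerable


# Hypothetical subject: 760 words read (five complete blocks),
# four of the five answerable questions right and one wrong.
print(round(memory_score(760, n_correct=4, n_wrong=1), 1))
```

Scoring against the answerable questions, rather than all 10, keeps the memory measure from being penalized by slow reading, which is scored separately as speed.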


Table 1. Experiment 1. The mean scores on the three performance measures, together with 
oral temperature, as a function of time of day 

                                                      Time of day
                                         08.00   11.00   14.00   17.00   20.00   23.00
New Scientist articles
  Mean no. of words read (speed)           716     764     770     744     762     768
  Mean % correct on multiple choice
    questionnaire, corrected for
    guessing (memory)                     59.1    51.3    56.6    49.9    41.8    45.0
Mean no. of category instances
  generated in 5 min                      40.4    42.8    43.2    43.0    43.2    41.9
Mean oral temperature (°C)               36.46   36.68   36.76   36.79   36.94   36.80





The mean scores on these three measures of performance at each time of day are shown 
in Table 1, together with the mean temperature readings. (The temperature readings are 
based on an N of 33 since 3 subjects in stage 1 of the experiment failed to produce 
complete temperature records.) Latin-square repeated-measure design analyses of variance 
(Winer, 1970, p. 539, plan 5) were used to analyse the three performance measures. These 
indicated that there was no significant effect of time of day on either the number of words 
read (F = 1.75, d.f. = 5, 150, P > 0.10) or the number of category instances generated 
(F < 1, d.f. = 5, 150, P > 0.25). However, the analysis of the immediate memory scores (i.e. 
percentage correct), based on an arc sine transformation, indicated that there was a 
significant effect of time of day on this measure (F = 2.65, d.f. = 5, 150, P < 0.05). There 
was no evidence of any asymmetric transfer effect, as judged by the ‘time of day by 
practice’ interaction, in any of these performance measures (F < 1, d.f. = 20, 150, P > 0.25, 
in all three cases). Finally, there was a significant time of day effect in the temperature 
scores (F = 14.15, d.f. = 5, 160, P < 0.001). 

Figure 2. The trend in immediate memory over the day for the information presented in prose for 
Expt 1 (●-●) and after Laird (1925) (●---●). For comparative purposes the readings have been 
expressed as a percentage of the overall mean for each study. The trend in oral temperature from 
Expt 1 is shown (○-○). 

In order to compare the immediate memory results with those of Laird (1925), for both 
studies the mean percentage correct at each time of day was expressed as a percentage of 
the overall mean. The results are shown in Fig. 2, together with the temperature readings 
from the present experiment. Two main points emerge from inspection of this figure. First, 
despite the use of rather different recall measures, the immediate memory results from the 
present study show a highly similar time of day effect, in terms of both the magnitude and 
the shape of the trend, to that found by Laird (1925). Secondly, immediate memory showed 
a mirror image trend to that in oral temperature with the sole exception of the 14.00 
reading. 
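
The rescaling used for this comparison (each time-of-day mean expressed as a percentage of the overall mean) is simple arithmetic. A minimal sketch follows, using the Expt 1 memory means as read from Table 1 with decimal points restored; the function and variable names are illustrative, and Laird's (1925) values are not reproduced because they are not given in the text.

```python
# Mean % correct (memory) at each time of day, as read from Table 1.
memory = {"08.00": 59.1, "11.00": 51.3, "14.00": 56.6,
          "17.00": 49.9, "20.00": 41.8, "23.00": 45.0}


def percent_of_overall_mean(means):
    """Express each time-of-day mean as a percentage of the mean of all times."""
    overall = sum(means.values()) / len(means)
    return {time: 100.0 * value / overall for time, value in means.items()}


relative = percent_of_overall_mean(memory)
for time, value in relative.items():
    print(time, round(value, 1))
```

Because each study is divided by its own overall mean, differences in absolute level between the two studies drop out and only the shape and relative size of the trend remain.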

Given that Colquhoun (1971) argues that the parallelism between basal arousal level and 
the circadian rhythm in body temperature breaks down for the period immediately after 
lunch, the immediate memory results are thus highly consistent with the view that 
variations in basal arousal level over the day are negatively related to variations in 
immediate memory. Indeed, despite the fact that temperature readings do not reflect the 
presumed ‘post-lunch dip’ in basal arousal, the trends over the day in oral temperature 
and immediate memory showed a significant negative correlation (Kendall’s tau = -0.87, 
P = 0.017). It should, however, be noted that neither the improvement in immediate 
memory from 11.00 to 14.00, nor that from 20.00 to 23.00 was statistically significant. 
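
The reported correlation can be checked against the six pairs of means in Table 1. The sketch below assumes the statistic is Kendall's rank correlation (tau) computed on those means, with an exact two-tailed probability obtained by enumerating all orderings of six ranks; the function and variable names are illustrative.

```python
from itertools import combinations, permutations

# Time-of-day means from Table 1, in the order 08.00 to 23.00.
temperature = [36.46, 36.68, 36.76, 36.79, 36.94, 36.80]
memory = [59.1, 51.3, 56.6, 49.9, 41.8, 45.0]


def kendall_s(x, y):
    """Concordant minus discordant pairs (assumes no tied values)."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    return s


n = len(memory)
n_pairs = n * (n - 1) // 2
s_obs = kendall_s(temperature, memory)
tau = s_obs / n_pairs

# Exact two-tailed P: proportion of all rank orderings whose |S| is at
# least as large as the observed |S|.
ranks = list(range(n))
orderings = list(permutations(ranks))
extreme = sum(abs(kendall_s(ranks, p)) >= abs(s_obs) for p in orderings)
p_two_tailed = extreme / len(orderings)

print(round(tau, 2), round(p_two_tailed, 3))  # -0.87 0.017
```

Of the 15 pairs of time points, only 11.00 versus 14.00 is concordant, which is the 14.00 exception noted in the text.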


Discussion 

Two main conclusions can be drawn from the results of this experiment. First, it would 
appear that the immediate memory for information presented in prose shows a rather 
different trend over the day to that for performance on digit span/sequence tasks. Thus the 
superiority in immediate memory at 09.00 compared with 15.00 found by Folkard et al. 
(1977) would indeed appear to be attributable to the nature of the task used. The 10 per 
cent difference found by Folkard et al. (1977) between the immediate memory scores at 
these times is very similar to the 9 per cent difference predicted by the averaged 
extrapolation of the results of the present experiment and those of Laird (1925). Further, 
the trend over the day in immediate memory for information presented in prose is exactly that 
which would be predicted by the arousal theory. 

The second conclusion to be drawn from the results of Expt 1 is that time of day would 
appear to have little effect on people’s ability to retrieve information from memory, at least 
as measured by the category instance generation task used in this study. While this is 
consistent with the failure of Folkard et al. (1977) to find such an effect in the delayed 
retention of prose, it is disappointing in view of the reliable results that have been obtained 
with other arousal manipulations using the present task (reviewed by M. W. Eysenck, 
1977). It is, of course, possible that an effect may have been found if the rather more 
sensitive technique of examining recall or recognition latencies had been used. In addition, 
as indicated earlier, there is a second way in which time of day may affect retrieval 
efficiency. Thus retrieval may be more efficient at the same time of day as the material was 
originally learned, i.e. there may be a state-dependent effect of time of day. Again, Folkard 
et al. (1977) failed to find such an effect but argued (p. 49) that one might be found if recall 
rather than recognition measures were used, and more extreme times of day (e.g. 20.00 and 
04.00) were compared. 


Experiment 2 

This experiment was designed (i) to distinguish between the arousal and interference 
interpretations of the effects of time of presentation on immediate and delayed retention 
and (ii) to examine whether time of day may affect retrieval efficiency in a state-dependent 
manner when extreme times of day are compared. The experiment formed part of a large 
shift work study, the other aspects of which have been reported elsewhere (Folkard et al., 
1978, 1979). In the present context the important aspect of the study is that concerned with 
night nurses’ immediate and delayed (28 day) retention of the information presented in an 
‘in-service’ training film shown at 20.30 or 04.00. 

In terms of the arousal theory (Colquhoun, 1971) outlined in the introduction, the times 
at which the material was presented correspond very closely to the maximum and 
minimum levels of basal arousal level due to circadian variation. Thus the arousal theory 
clearly predicts superior immediate retention at 04.00, but superior delayed retention 
following presentation at 20.30. 

In contrast, as indicated in the introduction, the interference theory makes the reverse 
prediction. There is greater potential for proactive interference at 04.00 than at 20.30, since 
the nurses would have been awake for longer, and thus immediate memory should be 
superior at 20.30. However, the potential for retroactive interference between presentation 
and the subsequent sleep period (which on average began at about 11.00) is clearly reduced 
following presentation at 04.00 and therefore delayed retention should be superior 
following presentation at 04.00. The arousal and interference theories thus make opposite 
predictions for both the immediate and the delayed retention scores. However, this is only 
true if the circadian rhythm in basal arousal level shows no adjustment to the inversion of 
the sleep/wake cycle. In view of this, all the nurses were shown the film on the first or 
second night of a period of successive night shifts in an attempt to minimize the level of 
adjustment to night work (see Folkard et al., 1978). In addition, 2-hourly temperature 
readings were taken from the nurses over the night shift on which they were shown the film 
so that the influence of level of adjustment on the memory scores could be examined. 

The second aim of this study was to look for state-dependent effects in the 28-day 
delayed retention scores. Approximately half the nurses shown the film at 20.30 were given 
the delayed test at the same time (i.e. 20.30) 28 days later, while the other half were given it 
at 04.00. In addition, the majority of the questions in each of the two parallel 
questionnaires used were of an ‘open-ended’ nature requiring a single word or short phrase 
answer. Thus extreme times of day were compared using mainly recall, rather than 
recognition, measures. The study would thus appear to be of the optimal design to reveal 
any state-dependent effects associated with time of day. 


Method 


Subjects. A total of 50 female nurses from Northwick Park Hospital took part in this experiment. 
Twenty-two of them were full-time night staff, who typically worked four nights a week and had a 
mean age of 33.4 years. Twenty-five were part-time night staff (mean age = 38.2 years) who normally 
worked two nights a week, but who never worked on the day shifts. The remaining three were 
student nurses (mean age = 24 years), who had had little experience of night work and have thus 
been included in the part-time group in this report. 


Design and procedure 


All the nurses were shown the same 10 min film on the use of radium therapy as part of their 
‘in-service’ training programme. Approximately half the nurses were shown the film at 20.30, the 
beginning of the night shift. The remaining nurses were shown it at 04.00. The film was shown to the 
nurses in small groups that varied in size from two to five. These groups were shown the film on 
different days in order to avoid undermanning of the wards; however, the film was always shown on 
either the first or second night of any individual nurse's period of successive night shifts. 
Approximately half the nurses were shown the film on their first night shift, and half on their 
second, this factor being approximately balanced over time of presentation. Immediately after seeing 
the film the nurses were required to complete one of two parallel questionnaires on the film. Each of 
these questionnaires consisted of 15 open-ended questions, requiring a single word or short phrase 
answer, and 5 multiple (4) choice questions. 

The nurses completed the second questionnaire 28 days later at either the same or ‘different’ time 
to that at which they had originally seen the film. This factor was approximately balanced over the 
other factors described above, while the order in which the nurses were given the two parallel 
questionnaires was also counterbalanced across the various factors. The second questionnaire was 
completed on the same night (i.e. first or second) of a period of successive night shifts as the original 
presentation had been given, in order to ensure that as far as possible each nurse's circadian rhythms 
were at the same level of adjustment to night work for both presentation and delayed recall. 


Results and analyses 


Preliminary analysis of the results indicated that there was no evidence that the open-ended 
and multiple choice questions differed in their sensitivity to the potential effects of either 
state-dependency or time of presentation. All the analyses reported were, therefore, based 
on the total number of questions correctly answered. 

Time of presentation. The overall effects of time of presentation on immediate and delayed 
(28 days) retention are shown in Table 2. In this table the delayed retention scores have 
been summed over the two times at which recall took place. A mixed design, least-squares 
solution, analysis of variance indicated that there was no significant overall effect of time of 
presentation (F = 1-94, d.f. = 1, 48, P > 0-10), but a highly significant effect of delay 

(F = 115-9, d.f. = 1, 48, P < 0-001) and a significant interaction between time of 
presentation and delay (F = 7-94, d.f. = 1, 48, P < 0-01). Independent 1 tests indicated that 
time of presentation had no significant effect on the immediate recall scores (t « 1), but that 
delayed recall was significantly higher following presentation at 20.30 (t — 2-45, d.f. — 48, 

P « 0:01). 


Table 2. The mean number of questions correctly answered in the immediate and delayed 
(28-day) retention test following presentation at 20.30 or 04.00 


                                  Recall
Time of                                    Delayed
presentation        Immediate              (28 days)
20.30 (n = 26)      11.81                  9.31
04.00 (n = 24)      11.63                  7.38


Effects of level of adjustment. In order to examine the influence of level of adjustment to 
night work on the effect of time of presentation, the slope of the best fitting straight line to 
each nurse’s 2-hourly temperature readings (from 20.00 to 08.00) for the night shift on 
which she was shown the film was taken as an index of adjustment. (The data from two 
nurses, both of whom had been shown the film at 20.30, had to be omitted from these 
analyses in view of missing, or abnormally high, temperature readings that resulted in 
spurious adjustment scores.) The two main groups of nurses, i.e. those shown the film at 
20.30 and those shown it at 04.00, were then subdivided into two equal size groups on the 
basis of this adjustment index. 
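
The adjustment index described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' actual procedure: the temperature readings are invented, the function names are hypothetical, a median split is assumed as the way of forming two equal-sized subgroups, and a more positive slope (temperature flat or rising across the night) is assumed to indicate better adjustment.

```python
from statistics import mean, median

# Times of the 2-hourly readings, expressed as hours since 20.00.
HOURS = [0, 2, 4, 6, 8, 10, 12]  # 20.00, 22.00, ..., 08.00


def adjustment_slope(temps, hours=HOURS):
    """Least-squares slope (deg C per hour) of temperature against time."""
    x_bar, y_bar = mean(hours), mean(temps)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, temps))
    sxx = sum((x - x_bar) ** 2 for x in hours)
    return sxy / sxx


def split_by_adjustment(slopes):
    """Median split (assumed rule): slopes above the median count as 'good'."""
    cut = median(slopes.values())
    return {nurse: ("good" if s > cut else "poor") for nurse, s in slopes.items()}


# Invented readings for four hypothetical nurses.
readings = {
    "A": [37.0, 36.9, 36.7, 36.5, 36.4, 36.5, 36.6],  # falls over the night
    "B": [36.6, 36.7, 36.8, 36.9, 37.0, 37.0, 37.1],  # rises over the night
    "C": [36.9, 36.8, 36.7, 36.6, 36.6, 36.6, 36.7],
    "D": [36.7, 36.7, 36.8, 36.8, 36.9, 36.9, 37.0],
}
slopes = {nurse: adjustment_slope(t) for nurse, t in readings.items()}
groups = split_by_adjustment(slopes)
```

In the experiment the split was made separately within the 20.30 and 04.00 presentation groups, so in practice `split_by_adjustment` would be called once per group.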


Table 3. The mean number of questions correctly answered in the 20.30 and 04.00 
immediate tests for the nurses showing poor or good adjustment of their temperature 
rhythm to night work 


                    Level of adjustment
Time of
presentation        Poor         Good
20.30               10.83        12.58
04.00               12.75        10.50


The mean immediate recall scores for these four subgroups are shown in Table 3. An 
analysis of variance indicated that level of adjustment interacted significantly with time of 
presentation (F = 7.29, d.f. = 1, 44, P < 0.01). Within the groups showing poor adjustment 
of their temperature rhythm, immediate memory was higher at 04.00 than at 20.30. 
Conversely, within the groups showing good adjustment of their temperature rhythm 
immediate memory was higher at 20.30 than at 04.00. 


Table 4. The percentage retained scores in the delayed test, following presentation at 20.30 
or 04.00 for nurses showing poor or good adjustment of their temperature rhythm to night 
work 


                    Level of adjustment
Time of
presentation        Poor         Good
20.30               78.3%        81.8%
04.00               65.2%        60.4%


[Figure 3 panels: Temperature (°C), Immediate recall (No. correct out of 20) and Delayed retention (% remembered), each plotted against time of ‘day’ (20.00 to 08.00), separately for the ‘poor’ and ‘good’ adjustment groups.] 


Figure 3. The effect of time of ‘day’ on oral temperature, immediate memory, and delayed retention 
(per cent retained) shown separately for the nurses showing ‘poor’ or ‘good’ adjustment of their 
circadian rhythm in oral temperature to night work. 


In order to control for these differences in the immediate recall scores, each nurse's 
delayed memory score was expressed as a percentage of her immediate one. The means of 
these ‘percentage remembered’ scores are shown in Table 4. Unlike immediate recall, the 
superiority of delayed retention following presentation at 20.30 appeared to be relatively 
unaffected by the level of adjustment of the nurses’ temperature rhythms. An analysis of 
variance confirmed that there was no significant interaction between level of adjustment 
and time of presentation (F < 1), although there was a significant main effect of time of 
presentation (F = 9.03, d.f. = 1, 44, P < 0.01). A parallel analysis based on the difference 
between the immediate and delayed retention scores showed a similar main effect of time of 
presentation (F = 7.85, d.f. = 1, 44, P < 0.01) and confirmed the absence of a significant 
interaction between level of adjustment and time of presentation (F < 1). 
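
The two derived measures used in these analyses, the ‘percentage remembered’ score and the immediate-minus-delayed difference score, amount to simple per-nurse arithmetic. The sketch below shows that arithmetic only; the scores are invented and the function names are hypothetical, not taken from the paper.

```python
def percentage_retained(immediate, delayed):
    """Delayed score as a percentage of the same nurse's immediate score."""
    if immediate <= 0:
        raise ValueError("immediate score must be positive")
    return 100.0 * delayed / immediate


def retention_loss(immediate, delayed):
    """Raw difference score of the kind used in the parallel analysis."""
    return immediate - delayed


# Invented (immediate, delayed) scores out of 20 for three hypothetical nurses.
scores = [(12, 9), (10, 8), (16, 12)]
percentages = [percentage_retained(i, d) for i, d in scores]
losses = [retention_loss(i, d) for i, d in scores]
mean_percentage = sum(percentages) / len(percentages)
```

Because each percentage is computed per nurse before averaging, the group mean of these scores need not equal the ratio of the group's mean delayed score to its mean immediate score.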

The degree to which the nurses' circadian rhythms in oral temperature were adjusted to 
night work when they saw the film would thus appear to have influenced the effect of time 
of presentation on immediate memory, but not that on delayed retention. This differential 
effect of physiological adjustment on the memory scores can be seen quite clearly in Fig. 3. 
In this figure the mean trends in oral temperature over the night on which the film was 
shown are also plotted. Inspection of this figure suggests that immediate memory may 
adjust more quickly to night work than oral temperature. Thus, although the degree of 
adjustment in temperature of the ‘good adjusters’ was far from complete, there appeared to 
be a complete inversion of the time of presentation effect on immediate memory. 

Finally, it should be noted that the majority (18 out of 24) of the nurses showing poor 
temperature adjustment were part-time staff or students. This is not surprising in view of 
the finding that full-timers show greater long-term adjustment to night work than 
part-timers (Folkard et al., 1978a). However, it is unlikely that the performance differences 
observed can be attributed to differences in the overall abilities of the full- and part-time 
staff, since this bias was exactly the same at both times of presentation. 


Time of retrieval. Since the delayed retention scores were unaffected by the nurses’ level of 
temperature adjustment, the results from all 50 nurses were combined to examine the 
potential effects of the time of retrieval. As above, these analyses were based on the 
‘percentage retained’ scores. The results are shown in Table 5. There was no evidence that 
retrieval at the ‘same’ time of day as the original presentation had occurred was superior to 
that at the ‘different’ time of day. Indeed, the mean levels showed the opposite trend 
although this was not significant (t < 1). Nor was there any evidence that time of retrieval 
per se affected the delayed retention scores (t < 1). Again, parallel t tests based on the 
difference between the immediate and delayed retention scores yielded similar results (t < 1 
in both cases). 


Table 5. The mean percentage retained scores in the delayed test as a function of (a) the 
time at which the delayed test was actually given, and (b) whether the delayed test was 
given at the ‘same’ or ‘different’ time to the original presentation 


(a) Time of delayed test            (b) Presentation/retrieval time
20.30          04.00                ‘same’         ‘different’
68.8%          73.9%                69.3%          73.1%
(n = 25)       (n = 25)             (n = 23)       (n = 27)


Discussion 

The results of this experiment clearly support an arousal, rather than an interference, 
interpretation of the effects of time of presentation both on immediate recall and on 
delayed retention. The interference theory clearly predicts that immediate memory should 
be less affected by proactive interference at 20.30 than at 04.00, irrespective of the nurses' 
level of adjustment to night work. In contrast, the arousal theory only predicts immediate 
memory to be superior at 04.00 if there is no adjustment of the circadian rhythm in basal 
arousal level, and can thus account for the interaction between level of adjustment and time 
of presentation in the immediate memory scores. 

However, the magnitude of the 04.00 superiority in immediate memory in the poorly 
adjusted group was somewhat smaller (about 18 per cent) than that which might be 
predicted from the results of Expt 1, and those of Laird (1925), where the difference 
between 08.00 and 20.00 was of the order of 30 per cent. There would appear to be two 
alternative explanations of this discrepancy. First, it is possible that the parallelism between 
oral temperature and basal arousal level (Colquhoun, 1971) may break down during the 
night when people are normally asleep, such that although oral temperature normally 
reaches a minimum at 04.00, basal arousal level may reach its minimum somewhat later. 
Secondly, the results could be due to partial adjustment of the circadian rhythm in basal 
arousal level even in the groups showing poor adjustment of their temperature rhythm to 
night work. 

The superior delayed retention following presentation at 20.30 is also inconsistent with 
an interpretation in terms of interference theory, since the potential for retroactive 
interference between presentation and the subsequent sleep period was clearly greater at 
this time. However, the finding that this 20.30 superiority was unaffected by the nurses’ 
level of adjustment to night work is also at odds with an interpretation in terms of a 
unitary circadian rhythm in arousal such as that proposed by Colquhoun (1971). The 
present results suggest that the circadian factor, or factors, underlying time of day effects in 
immediate memory adjusts to night work more quickly than that, or those, responsible for 
the effects of time of presentation on delayed retention; obviously ‘basal arousal’ level can 
only adjust at a single rate. A similar conclusion can be drawn from other studies where 
performance on tasks involving a high working memory load has been found to adjust to a 
shift in the sleep/wake cycle more quickly than either body temperature, or performance on 
more simple tasks (Hughes & Folkard, 1976; Monk et al., 1978). Clearly, if these results 
are to be interpreted within an arousal framework, it is necessary to adopt a more complex, 
multifactor theory of arousal, such as that proposed by Broadbent (1971), with the 
circadian rhythm in the different factors adjusting to night work at different rates. Indeed, 
this difference in the rate of adjustment suggests that the circadian factor(s) responsible for 
the time of presentation effect on immediate memory may be relatively exogenous, while 
that, or those, responsible for delayed retention may be endogenous. 

It is unclear whether these different circadian factors responsible for the time of 
presentation effect on immediate and delayed memory are normally in phase with one 
another and only dissociate as a result of a shift in the sleep/wake cycle, as is the case for 
the secretion of adrenalin and noradrenalin (Akerstedt & Levi, 1978); or whether they are 
normally out of phase like the circadian rhythms in body temperature and in the secretion 
of cortisol (Conroy & Mills, 1970). If the latter is the case, then the time of presentation 
effect on delayed retention might be expected to differ from the mirror image trend to that 
on immediate recall predicted by the unitary arousal theory. To date, no study of both the 
immediate and delayed retention of this kind of material has compared more than two 
times of presentation. Thus, although an interaction between time of presentation and 
delay of retention has consistently been found, this cannot be taken as proof for mirror 
image trends over the day. 

Finally, the results of this experiment support those of Expt 1 and of Folkard et al. 
(1977) in failing to show any effect of time of day on retrieval efficiency, despite the 
comparison of extreme times of day and the use of recall rather than recognition measures. 
This failure was common to both the potential effect of time of retrieval per se, and to 
potential state-dependent effects mediated by the differing physiological states known to 
exist at different times of day. Although this failure to find a state-dependent effect is 
somewhat disappointing, the authors are unaware of any study that has found such an 
effect with naturally occurring variations in physiological state. Thus state-dependent 
effects may be limited to situations where people are presented with information while in 
an abnormal physiological state due to the administration of, for example, drugs (e.g. 
Goodwin et al., 1969). 

The failure to find a main effect of time of retrieval is somewhat more surprising in view 
of the fact that such effects have been observed for differences in personality type and with 
the administration of loud noise (M. W. Eysenck, 1977). However, these effects may be 
mediated by changes on different dimensions of arousal, such as anxiety, to those 
responsible for time of day effects in performance. Thus, although the effects of both 
personality and noise have been found to interact with those of time of day (Blake, 1971), 
this cannot be taken as evidence that they affect the same dimension or factor of a 
multifactor arousal system. 


Conclusions 


Despite the uncertainties raised by the results of Expt 2 as to the adequacy of the 
unidimensional arousal theory in accounting for time of day effects, a number of 
conclusions can be drawn from the present studies. First, it would appear that immediate 
memory for information presented in prose or other naturalistic contexts varies fairly 
substantially (±15 per cent) over the waking day. The nature of this variation is exactly 
that which would be predicted by the unidimensional arousal theory if it is assumed that 
increases in arousal impair such memory. In contrast, the available evidence suggests that 
this variation cannot be accounted for in terms of increasing proactive interference over the 
day. Second, these effects of time of presentation would appear to be due to changes in 
the encoding/storage processes, rather than to changes in retrieval efficiency. To date, there 
is no evidence that retrieval efficiency varies with time of day per se, or in a state-dependent 
manner. 

Third, and perhaps most important, the time of day at which material is presented 
would appear to have a sizeable effect on the delayed retention of it. While again, the 
available evidence suggests that this effect is not mediated by the potential for retroactive 
interference between presentation and the subsequent sleep period, the unidimensional 
arousal theory also has difficulty in accounting for this effect. Nevertheless, this effect of 
time of presentation on delayed retention is probably attributable to circadian variations in 
some, as yet unknown, factor(s) that differs from that responsible for the immediate 
memory results. Although the available evidence is consistent with the view that the effect 
of time of presentation on delayed retention normally shows a mirror image trend over the 
day to that on immediate memory, this has yet to be proven. Finally, in view of the 
practical implications of the effect of time of presentation on delayed retention, there is a 
clear need for research on the nature of the changes in encoding/storage processes that 
mediate these effects. The authors believe that such research may be profitably pursued 
despite the complications encountered in attempting to isolate the underlying circadian 
factors responsible for them. 


Erratum 


Kevin Wheldall and Barbara Poborca (1980). Conservation without conversation? An 
alternative non-verbal paradigm for assessing conservation of liquid quantity. British 
Journal of Psychology, 71, 117-134. 


p. 129, last line/p. 130 first line 


for ‘Moreover, we have demonstrated that all children who conserved non-verbally could 
also conserve verbally.’ 


read ‘Moreover, we have demonstrated that all children who conserved verbally could also 
conserve non-verbally.’ 


Book reviews 


Drugs and the Inheritance of Behaviour: A Survey of Comparative Psychopharmacogenetics. By P. L. 
Broadhurst. New York & London: Plenum. 1978. Pp. vii+ 206. 


It is difficult enough to survey a dual hybrid science such as psychopharmacology or behavior(al) 
genetics. When one deals with the interrelations of three sciences, each with its own specialized 
vocabulary, techniques and intellectual traditions, the difficulties are compounded. Peter Broadhurst 
has accepted the challenge and, on the whole, has been successful. The only other volume 
(Eleftheriou, 1975) that attempts a general survey of psychopharmacology is multi-authored. As in 
most such compilations each author concentrates on his own research, and general syntheses and 
critical evaluation of hypotheses are deficient. 

Broadhurst chose to organize his material according to the genetical techniques used in research: 
sex differences, pharmacogenetic selection, strain differences, diallel crosses, and recombinant inbred 
strains (RIS). Since so much research has been done on strain differences it has been divided among 
three chapters, each dealing with a different class of drugs. Were I the author, I would probably have 
based my primary organization on drug actions (that is, on varieties of behavioural phenotypes) and 
then described how various genetical techniques contribute to our knowledge of variation in 
sensitivity to, and effects of, each class of drug. But organization is a matter of taste, and tastes differ. 
The book is well indexed, and one can locate desired material by name of drug, species, strain, or 
type of behaviour affected. 

The literature on experimental psychopharmacogenetics through 1977 is well covered. Important 
papers are described in considerable detail and evaluated critically. Broadhurst’s own predilection for 
the biometric approach to behaviour-genetic analysis is evident, and he expresses doubts concerning 
the interpretation of some other approaches — particularly the postulation from RIS studies that many 
differences in the effects of drugs on behaviour are functions of single-locus allelic substitutions. I 
happen to share his scepticism, but others might call us biased. 

The attempt to condense descriptions of complex experiments into small spaces leads occasionally 
to long, involved sentences that are hard to follow. To ease matters somewhat, Broadhurst included 
17 summarizing tables that are very helpful in obtaining an overview of research on a particular 
subject. 

In his final chapter Broadhurst makes a plea for more sophisticated forms of genetic analysis. He 
expresses disappointment in the heavy concentration on differences between selected lines and inbred 
strains. It seems to be easy to find such differences, but the search for strain variability for its own 
sake does not enhance genetic knowledge very much. This is a good point, but many (perhaps most) 
researchers in psychopharmacogenetics are primarily interested in using genetic variation in animals 
as a model for interpreting individual variability in response to drugs by humans. Certainly, such 
considerations have shaped genetic research on the consumption of alcohol by animals, and on 
variable sensitivity to injected drugs. Would more detailed biometrical analysis be helpful to a 
pharmacologist who was trying to correlate such strain differences with neurotransmitter levels in the 
brain or with the rate of alcohol metabolism? 

A reader needs a background in the vocabulary and basic principles of all three component 
sciences to obtain maximum benefit from this book. It does not deal explicitly with such topics as the 
principles of genetic selection, how to analyse a diallel cross, how to measure emotionality, or the action 
of specific drugs on the release and uptake of catecholamines. But such a massive volume would be 
less useful than the data-packed, critical monograph that Professor Broadhurst has written. His book 
is a most valuable addition to the library of those who are already interested in one of the dual 
hybrid sciences: behaviour genetics, psychopharmacology, pharmacogenetics. In its pages one finds 
an up-to-date summary of the field, and constructive ideas for its future development. 

JOHN L. FULLER 


Criminology in Focus: Past Trends and Future Prospects. By A. Keith Bottomley. Oxford: Martin 
Robertson. 1979. Pp. 181. Cased, £8.95, paper, £3.50. 


The four topics discussed in this book are: definitions of crime, the search for causes, criminal justice 
and ‘toward a rehabilitation of punishment’. The second and fourth of these will be of most concern 
to psychologists, although they will be interested in the discussion of the various attempts to broaden 
the concept of crime beyond the legal definition and the areas of discretion that have been identified 
in the operation of the criminal justice system. 

In his discussion of the many attempts to discover the cause of crime, Bottomley points out that 
the great majority have been concerned with attempts to explain why, or to predict which, particular 
individuals rather than others will commit offences, and not with explaining why criminal activity 
occurs. The argument is rather convoluted and at times has the compressed flavour of teaching notes, 
but he comes to the conclusion that no satisfactory explanation has yet been advanced which meets 
the criterion that ‘the conditions which are said to cause crime should always be present when crime 
is present and they should be absent when crime is absent’. He suggests that causative explanations in 
criminology (and in the other social sciences, although he is dubious whether criminology ought to be 
one) cannot of their nature be tested in terms of their predictive accuracy but only in terms of their 
‘plausibility and rendering intelligible the behaviour in question’. He suggests that an explanatory 
framework which includes an analysis of the objective crime situation, investigation of the 
background factors in the individual or social group, and close attention to the subjective meaning 
attached to the situation by the actor ‘would provide a satisfying [plausible?] explanation of specific 
criminal acts and recognize the essential unpredictability of so much crime’. This is rather like the 
framework suggested by the psychologist Clarke (1977) with an interest in practical matters of crime 
prevention rather than theory building. That two criminologists from different parent disciplines, and 
adducing rather different types of evidence, arrive at substantially the same position suggests a high 
level of validity. 

Bottomley deals with a very topical issue in his attempt to rehabilitate punishment. He notes that, 
until very recently, the dominant trend in 20th century penology has been toward replacing 
punishment with individual rehabilitation. Attempts to demonstrate the effectiveness of one or other 
rehabilitative method have probably taken up most of the energies of most psychologists working in 
this field during the past several years and there is by now general, albeit reluctant, acceptance of the 
evidence that such attempts, whether using psychotherapeutic treatment modes or behaviour 
modification techniques, have failed to reduce individual recidivism rates significantly. As a result, 
and because the volume of crime continues to increase, there has been a swing towards the ‘just 
deserts’ justice model. This involves consideration of the use of custodial sentences as a means of 
individual and general deterrence and for the containment of those offenders who are considered 
likely to continue to be responsible for the most serious crimes. 

He attempts to rehabilitate punishment, and ensure humane custodial régimes, by suggesting that 
prison sentences should be regarded as periods of social disqualification providing the opportunity 
to requalify the offender. He does not suggest specific methods of requalification but argues that the 
concept can be extended to include reparation and restitution not only to the specific victim but to 
society as a whole. 

This book is relatively free from technical sociological language and provides a thoughtful 
summary of the current concerns of academic criminology and of the philosophical issues involved in 
the implementation of criminal legislation. It should, as the author intended, enable psychologists 
with an interest in criminology to share some of the concerns of their sociological colleagues. 

JOY MOTT 


Entering the World of Number. By R. T. Green & V. J. Laxon. London: Thames & Hudson. 1978. 
Pp. 180. £4.95. 


This short book has the laudable aim of making recent advances in the psychological understanding 
of how children develop early number concepts available in a digestible form to parents of pre-school 
children. In my view, unfortunately, it fails lamentably to reach this goal for a number of reasons. 

Firstly, it starts from the standpoint of psychologists with something to sell rather than from the 
viewpoint of parents who wish to help their own children. Thus, a sequence of carefully constructed 
learning opportunities (hideously entitled, LOPs) is described in detail. Most psychologists will 
appreciate the parallel between these steps and neo-Piagetian theories of concept development and, 
therefore, will see where the exercises lead. Parents may be puzzled as to the link between the 
exercises and arithmetic skills. They will remain puzzled if they only read this text. 

Secondly, as with all attempts to sell (or popularize) psychological findings, the authors aim to use 
no jargon. Immediately, they not only foist ‘LOPs’ upon us, but also ‘PLAYTHINKS’, which 
appears to be the registered trademark of play material they have developed. Parents are next 
introduced to ‘subitizing’, and I am still not sure if I need to understand that one! 

The writing has many examples of the sort of throwaway humour which either appeals or irritates, 
but threatens to come between the reader and the text. Whilst the authors point to the many 
linguistic problems we face in getting across concepts of number, somehow they confuse more than 
they clarify. Parents may well pick up many handy hints, but it is doubtful if many will persevere with 
the daunting text. Even those who do will find that the sort of apparatus their children will meet in 
infant school — Cuisenaire, Stern and Dienes — is mentioned only briefly in one paragraph on p. 116. 
This is scarcely the way to link home-based learning with school-based teaching. A pity, because 
many parents would welcome a text which achieved the aims of this book. 

WILLIAM YULE 


Intelligence: Heredity and Environment. By P. E. Vernon. Reading: W. H. Freeman. 1979. Cased, 
£10.10; paper £5.30. 


Professor Vernon tells us that this is probably the last new book he will write, although he still hopes 
to revise earlier ones. He expresses the hope that if it does anything to persuade readers that both 
genetic and environmental factors are important, and that intelligence testing, if looked at in this 
light, still has a major role to play in psychological theory and educational practice, he will regard it 
as the culmination of over 50 years spent in the field of mental measurement. We believe that this 
hope will be fulfilled and that this well-researched and clearly written text provides an important and 
balanced summary of a very wide range of scientific literature. 

We express this confidence despite a major flaw which the author belatedly recognized, namely that 
the work of the late Sir Cyril Burt is, to an unknown extent, riddled with fraud. It should have been 
unnecessary for the book to be prefaced by a late inserted ‘Important note to the reader’ indicating 
that parts of ch. 11 require drastic revision. Had Vernon closely scrutinized Burt's seminal articles on 
genetic influences in the determination of intellectual differences, and perhaps had he been less closely 
associated with Professor A. R. Jensen, he would have accepted the strong evidence for fraud first 
made public in 1976, and later elaborated. Moreover, it is not only ch. 11 which requires revision, 
but also parts of ch. 12, 15 and elsewhere. Unfortunately the inclusion of any reference to the work of 
Burt, and his imaginary collaborators Howard & Conway, may well serve to perpetuate recognition 
of the papers which must as soon as possible be deleted from all texts and all tables, together with 
any conclusions by other authors based on these data. For example, on p. 290 we find Jensen's 
argument that in only 5 per cent of recorded MZA twins did the intra-pair IQ differences exceed 15 
points. This calculation is based on all four studies reporting on a total of 122 MZA pairs, of which 
Burt's 53 now have been written off. 

The book is divided into four parts: the nature of intelligence; child development and 
environmental effects on intelligence; genetic influences on individual differences in intelligence; and 
genetic influences on group differences. There are no fewer than 52 pages of references, which could 
be of immense value to research workers. Each of the 21 chapters is followed by an excellent 
summary. 

In tracing the history of intelligence testing, Vernon indicates how the issue became increasingly 
politicized after the disastrous failures of Head Start, and Jensen's attempted explanation for this. He 
outlines very fairly the criticisms of intelligence tests, and repeats some of his important statements of 
a decade ago that ‘the notion of intelligence as a cause for good or poor achievement must be 
discarded. . .the distinction between them derives mainly from the greater generality of intellectual 
skills and their lesser dependence on deliberate teaching’. 

Inevitably, in a book of this length which attempts to cover one of the most complex problems in 
human research, some aspects are less well handled than others. For example, the chapter on the 
effects of prenatal, perinatal and other constitutional factors, while wide ranging, is a little sketchy. 
Surprisingly, the important review by Sameroff & Chandler (1975) is not mentioned. Similarly, the 
chapter on socio-economic advantage and disadvantage is all too brief, and here the author has failed 
to pick up both Rutter & Madge's (1976) Cycles of Disadvantage and the various reports of the 
National Child Development Study, although he quotes some of these latter elsewhere in the book. 

The author's summary of Bernstein's contribution, as well as of his critics, particularly Labov and 
Ginsberg, seems to us somewhat superficial. More serious in a book with this title, the chapter on 
foster studies is weak. It is of course unfortunate that neither the Texas Adoption Project (1979) nor 
Scarr & Weinberg's (1978) fairly recent study of older adopted children could be extensively reviewed. 
However, the transracial adoption study of these latter authors, referred to in ch. 19, contains 
correlational data for parents and their adopted and biological children which would have been 
germane to some of the discussion in ch. 14. Vernon, unfortunately, has again found space for some 
of Burt's dubious ‘findings’! 

Vernon gives a sober account of work on ethnic differences. His Table 21.1 summarizes this 
evidence. Out of 30 areas of research, five favour genetic factors, a further four possibly do, two 
favour environmental causes and a further six possibly do. For the remainder the verdict is ‘either’ or 
‘neither’. 

The author makes a brave attempt at reconciling the conflicting claims of hereditarians and 
environmentalists, and mentions every relevant area of work, even if sometimes only in passing. It is, 
however, like many of the best books in a rapidly developing subject, already a little out of date. 
Thus Vernon missed the publication of Fifteen Thousand Hours by Rutter et al. (1979) and the two 
major adoption studies already mentioned. 

This review may perhaps appear to have concentrated upon the book’s weaknesses; we feel that its 
strengths clearly outweigh these. No other volume contains so much material on this subject. Its very 
wide-ranging coverage, its middle-of-the-road viewpoint that both genetic and environmental factors 
are important in individual differences, reflect the present state of knowledge; extremists of either 
persuasion will derive little comfort from it. As a source of reference and of information it will 
become invaluable reading for anyone interested in this field. It is fully within Vernon's tradition of 
providing timely and fairminded overviews of complex psychological problems. 

ANN M. CLARKE and A. D. B. CLARKE 


Search for Harry Price. By Trevor H. Hall. London: Duckworth. 1978. Pp. 237. £7.95. 


Harry Price, described as having been the foremost psychic journalist of his generation, is best known 
today for his involvement with Borley Rectory, about which he wrote in his books, The Most 
Haunted House in England (1926) and The End of Borley Rectory (1946). Borley received extensive 
publicity in newspaper articles and through the BBC, until in 1956 it was made clear in a report by 
Dr Trevor H. Hall, in conjunction with Dr E. J. Dingwall and Mrs K. M. Goldney, that the whole 
affair was a fraud rigged up by Harry Price. 

Today, we are told, history is almost repeating itself. The haunting of Borley is again quoted as 
providing evidence for psychical events and a resurgence of popular belief in the Borley legend is 
being boosted by press and television. In order to combat this, Dr Hall has produced this exhaustive 
scrutiny of Harry Price in which he gives detailed evidence to show that in almost everything he did 
or said Price was a liar and a fraud. 

The book reads like a detective story — or collection of short stories — and it is significant that Dr 
Hall has also written, with distinction, about Sherlock Holmes. First a comparison is made between 
Price's own account of his family background and early life, and facts as revealed from public records 
and other sources. The later chapters contain findings from a series of investigations into incidents in 
Price's career. 

At the age of 15 Harry Price founded the Carlton Dramatic Society in order to dramatize 
experiences he had during a visit as a child to an old manor house in a Shropshire village. Price kept 
the identity of the house secret, but referred to it as ‘Parton Magna’. It provided the setting for an 
article he wrote in 1926, ‘A strange experience with a Shropshire poltergeist’, published in the Journal 
of the American Society for Psychical Research. Price claimed that the house was haunted by the 
spirit of a young girl, Mary Hulse, who had drowned herself in a nearby river in the early seventeenth 
century. Further enlargements of the story appeared in Confessions of a Ghost Hunter and in 
Poltergeist over England. 

In a chapter ‘Search for Parton Magna’, Dr Hall describes how from the available clues, some of a 
contradictory nature, he was able to locate the house and establish its connection with Price. It 
emerges after some astute detective work that Price’s uncle had resided in the house for a short time 
and that Price had been taken there as a child by his parents. Mary Hulse was the name of the first 
child of Price's grandfather and she had died of natural causes in 1835. 

Other chapters cover a wide range of topics ranging from Price's excursion into archaeology and 
numismology to magic, conjuring and heraldry. An account is given of ‘The ridiculous Bloksburg 
tryst affair’ in which Price collaborated with C. E. M. Joad, at that time Head of the Department of 
Philosophy and Psychology at Birkbeck College. The two of them set off in 1932 for the Hartz 
Mountains to conduct an experiment on the summit of the Brocken, where in the presence of a 
‘maiden pure in heart’, a ‘virgin he-goat’ was to be converted through various magical rites into a 
beautiful youth. In the presence of an army of photographers, pressmen and a camera crew, Price 
uttered his incantations but the goat stubbornly refused to become human - even after having a 
bottle of wine poured over him. 

We learn that Joad became Chairman of The University of London Council for Psychical 
Investigation, members of which included Cyril Burt and J. C. Flugel. This organization, which had 
no official recognition from the university, replaced The National Laboratory of Psychical Research 
established by Price in 1926. Details are also given of Price’s attempts to set up, and endow, a 
department of Psychical Research at London University. 

In addition to revealing the facts about Harry Price, Dr Hall brings together information about 
activities around the occult fringe of psychology in the 1930s. He shows how a detailed and 
penetrating analysis of writings and events can eventually untangle fact from fiction. Each chapter in 
this book makes fascinating reading, both as an account of the inventions and fantasies of a complex 
and devious individual and as an example of detective work of great skill. 

Dr Hall quotes the prophetic words of a reviewer of the original report on Borley who wrote in the 
Economist, ‘It will take more than this antidote to counter so massive and thoroughly assimilated a 
dose of dope’. His present book may provide the extra evidence required to dissuade those who are 
seeking to resuscitate the ghosts of Borley. 

C. E. M. HANSEL 


Not Quite Like Home: Small Hostels for Alcoholics and Others. By Shirley Otto & Jim Orford. 
Chichester: Wiley. 1978. Pp. xiii + 218. £7.95. 


Traditionally, one of the main purposes of the hostel has been to provide food, shelter and, perhaps, 
some sort of home-life for the homeless and rootless. Given the behavioural problems of many 
residents, hostels have also tended to specialize not only in the care but also in the treatment of 
particular groups of clients. As a result their objectives have often seemed confused and contradictory. 
Sinclair (1971) remarked of one type ‘...probation hostels have been seen as temporary homes, 
short-term training institutions, therapeutic communities and families’. Professional and official 
interest in the hostel setting as a potentially cheap form of ‘community based’ residential programme 
for the treatment of socially undesired behaviour has doubtless fostered the recent growth in the 
hostel's popularity as an alternative or adjunct to other forms of residential intervention. 

Otto & Orford's book is primarily about the role of ‘small residential communities’ in the care and 
rehabilitation of alcoholics. It is worth saying right at the start that its conclusions are unlikely to give 
much encouragement to those working in the field; but in many ways it provides a timely and useful 
antidote to the unrealistic expectations which tend to be generated by any novel treatment proposal. 
Part One (four chapters) gives an overview of small hostels (including those for alcoholics) and their 
problems. An opening chapter discusses definitions and distinguishing characteristics of small hostels, 
and traces the growth of the hostel movement to the reaction against the problems faced and created 
by larger residential institutions. The development of halfway houses and hostels for alcoholics is then 
outlined, and the authors conclude by talking about the results of their small survey of all such 
hostels in the London area. This provides basic information about size and structure, staffing, hostel 
programmes and attitudes to residents’ drinking. Chapter 2 looks at the aims and methods of small 
hostels: the authors suggest that hostels can be described as operating in accordance with a 'social 
influence model', their aims being either to socialize their residents through social control and/or to 
improve their skills and resources through social support. Of primary importance to the achievement 
of such aims, it is suggested, are the cohesiveness of the resident group and the social climate of the 
hostel, and Otto & Orford outline some of the factors which affect these parameters. Chapter 3 draws 
on the results of previous research to discuss varieties of hostel organization and, in particular, the 
findings of the authors’ survey of residents’ decision-making practices in the eight London hostels. 

In the last chapter of Part One the authors outline some of the problems which currently hamper 
the development of hostels as effective treatment milieux. In addition, although the study as a whole 
seems somewhat preoccupied with treatment processes at the expense of outcome evaluation, they 
provide a good discussion of the problems of trying to assess the long-term effectiveness of hostels. 
They conclude that despite the difficulties faced by alcoholism hostels (unclear aims, uncertainty 
about controls over drinking, unsatisfactory length of stay and leaving patterns, the hostel's function 
as a station on the ‘rehab. circuit’), the part that these small institutions at present play in the care 
and rehabilitation of alcoholics can still be justified. 

Although Otto & Orford's book is in two parts, it really consists of three distinct books, each 
threatening the coherence and integrity of the final product. The organization of Part One, which 
constantly juxtaposes the development and organization of alcoholism hostels against those of the 
broader hostel movement, just fails to maintain a balance between general and particular which 
would have made it either a book about hostels or one about small hostels for alcoholics. The result 
is a lively, thorough and thought-provoking treatment of a mass of new material, but within a 
framework which sometimes makes for repetitious and confusing reading. 

Part Two, again, is yet another potential book, not only because the authors have failed fully to 
integrate Otto's intensive 72-week study of two small alcoholism hostels within the discussion in Part 
One, but also because the effect of its results is to question and undermine the earlier tones of 
qualified optimism. The study, which makes no claims to be a controlled trial, compares the 
operation of two hostel programmes and resident groups. Background details of the hostels (one of 
which was fully described by Cook, 1975) and the research are given in ch. 5. The remaining chapters 
compare various aspects of the living and treatment experiences in the two houses. These include the 
referral and selection of residents (hostel B's residents seemed closely to approximate the stereotype of 
the vagrant alcoholic); staff (n = 4) characteristics; house routines (especially the work habits and 
finances of residents, analysis of group meetings, staff-resident contact, house conduct, resident 
involvement in rule-enforcement); measures of the stability of the resident group, and their and staff's 
perceptions of the hostel environment. Further chapters provide short narrative accounts of events in 
the two houses over a 10-month period, and details of their short-term effects. All these chapters 
conclude with useful, albeit repetitive, summaries of similarities and differences between the two 
hostel environments. These findings, together with the final chapter of recommendations, are sure to 
be of great interest (and moral support) to those involved in the running of hostels. 

They are also likely to provide a salutary note of caution to policy-makers. Many of the results 
echo those from research on other — especially ‘new’ — forms of residential or community-based 
treatment. Thus, selection procedures and other events meant that only 25-30 per cent of referrals 
were admitted, major reasons for rejection (apart from candidates just failing to turn up) being 
perceived lack of motivation and difficulty in integrating with the existing resident group. As in many 
another small residential institution, rapid turnover of clients and low numbers of longstaying ones 
made it difficult to establish and maintain a core culture-carrying group of residents. In consequence 
the hostels were frequently subject to quite severe oscillations in stability and atmosphere. As the 
authors comment: ‘The inherent instability of such a system, and the difficulty of creating a group of 
beneficial influence, are likely to remain cardinal features of small alcoholism hostels.’ Lastly, 40-50 
per cent of residents left within the first month and only a minority (20-30 per cent) stayed for a 
minimum ‘ideal’ period. Of all departures, only 10-14 per cent left in what the authors describe as a 
planned and orderly way. As for the longer term: ‘After leaving House A or B the majority of 
ex-residents continued to make the rounds of hospitals, courts, prisons, reception centres, lodging 
houses, and hostels large and small, as they had done before their admission’ (p. 189). 

Unfortunately, Wiley's manuscript reproduction method makes this book a physical chore to read, 
even if the small skinny type and very basic layout do allow the maximum amount of text to be 
packed into the minimum number of pages. It would be a pity if this were to deter those with an 
interest in the treatment of alcoholism and the dynamics of small hostels from reading Otto & 
Orford's balanced assessment of the present performance of these institutions, and their prospects. 

DEREK CORNISH 

Cook, T. (1975). Vagrant Alcoholics. Henley-on-Thames: Routledge & Kegan Paul. 

Sinclair, I. A. C. (1971). Hostels for Probationers. London: HMSO. 


Introduction to Statistics: A Nonparametric Approach for the Social Sciences. By C. Leach. London: 
Wiley. 1979. Pp. 356. Cloth, £15.00; paper, £5.25. 

Distribution-free Methods for Non-parametric Problems: A Classified and Selected Bibliography. By 
B. Singer. Leicester: The British Psychological Society. 1979. Pp. 72. £5.50. 


Chris Leach’s book sets out to be ‘an introductory textbook for students with no previous 
background in statistics’ and also ‘a handbook for researchers with little formal training in statistics’. 
There is a 48-page introductory chapter covering descriptive statistics, data types, very elementary 
probability, and the basic ideas of null hypothesis testing. The remaining chapters cover a large 
selection of tests (Mann-Whitney, Wilcoxon, Fisher Exact, Binomial, McNemar, Gart, 
Kruskal-Wallis, Kendall S, Jonckheere, Spearman ρ, Friedman, Cochran, Chi-Square); measures of 
association in ordered and unordered contingency tables (δ, γ, τb, λB, λ, φ², C) are discussed; there is 
advice on multiple comparisons; and there is a brief section on measuring agreement between 
observers, using Cohen's κ. Tests of the Kolmogorov-Smirnov type are omitted. There is no 
discussion of the partitioning of χ², and nothing on multivariate techniques. 

The book has many excellent features. Explanations are clear and careful throughout; look, for 
example, at the discussion of the properties of δ, γ, τb in 5.6. Tables are coherent in layout (all are 
critical value tables), and completely understandable without backward references to the text. Tests 
which amount to special cases of Kendall’s S (e.g. the Mann-Whitney and Fisher Exact tests) are 
explicitly treated as such, and S is used as their test statistic. Effect-size estimators are described, 
where simple ones are available; for example, the Hodges-Lehmann estimators for the two-sample 
cases, and δ, γ for anything which can be put in ordered contingency table form. Small n examples of 
null hypothesis distributions are derived from the null hypothesis definition as tests are introduced. In 
short, the book makes available some of the material in Bradley (1968), Hollander & Wolfe (1973), 
Goodman & Kruskal (1972) while remaining readily comprehensible to most psychology 
undergraduates. It seems to me clearly superior to the much used book by Siegel (1956), which it 
deserves to replace as the standard undergraduate text on the subject. 
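
The relationship noted above, in which the Mann-Whitney test is treated as a special case of Kendall's S and accompanied by a Hodges-Lehmann effect-size estimate, can be illustrated with a minimal Python sketch. This is not taken from Leach's book: the function names and the two small samples are hypothetical and chosen purely for illustration, and the sketch assumes the usual pairwise definitions (S as concordant minus discordant pairs, U counting ties as one half, and the shift estimate as the median of all pairwise differences).

```python
from itertools import product
from statistics import median


def kendall_s_two_sample(x, y):
    """Kendall's S for two groups: pairs with y > x minus pairs with y < x."""
    s = 0
    for xi, yj in product(x, y):
        if yj > xi:
            s += 1
        elif yj < xi:
            s -= 1
        # tied pairs contribute nothing to S
    return s


def mann_whitney_u(x, y):
    """Mann-Whitney U: number of pairs with y > x, ties counted as one half."""
    return sum(
        1.0 if yj > xi else 0.5 if yj == xi else 0.0
        for xi, yj in product(x, y)
    )


def hodges_lehmann_shift(x, y):
    """Two-sample Hodges-Lehmann estimate: median of all differences y - x."""
    return median(yj - xi for xi, yj in product(x, y))


# Hypothetical illustrative data, not from any study discussed in the review.
control = [12, 15, 11, 18]
treated = [14, 20, 19, 22, 17]

s = kendall_s_two_sample(control, treated)
u = mann_whitney_u(control, treated)
shift = hodges_lehmann_shift(control, treated)

# S and U carry the same information: S = 2U - mn for samples of size m and n.
print(s, u, shift)
```

With these made-up samples the two statistics satisfy S = 2U - mn, which is why a single set of critical values for S can serve both tests; the shift estimate gives the size of the difference on the original scale rather than only its significance.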

It is, nonetheless, a great pity that details of the various estimators' standard errors are not given. 
Confidence intervals remind the researcher that significant results may often be substantively trivial, 
and non-significant ones inconclusive (a reminder all the more important for non-parametric tests, 
which do not in general allow easy power calculations). It is true that some relevant references are 
given, but they may not be easily understood by most readers of this book. For example, can the 
‘researcher with little formal training in statistics’ be expected to struggle (successfully) with 
Goodman & Kruskal (1972) in order to work out a confidence interval for 4? 

A more fundamental criticism is that the book, as almost all other elementary statistics books, 
presents statistics as if it were a body of coherent technical knowledge, like the principles of 
oscilloscope operation. In fact statistics is a collection of warring factions, with deep disagreements 
over fundamentals, and it seems dishonest not to point this out. Elementary statistics books conspire 
to produce psychologists who are able to do five-way analysis of variance while remaining incapable 
of coherent discussion on the problems associated with Bayes' theorem or with the attempt to delimit 
the applicability of probability theory. However little of the problems can be got across at an 
elementary level, there is surely an obligation to point out that they exist. 

Bernard Singer's classified bibliography of papers and books on distribution-free methods is 
reprinted from the British Journal of Mathematical and Statistical Psychology, 1979, 32, with the 
addition of an author index. Its starting-point is 1974 for contingency table references and 1961 for 
everything else, though a few earlier classics are included; it thus takes over from the earlier reviews 
by Savage (1962) and Killion & Zahn (1976). There is a brief introductory section, including some 
comments on popular misconceptions about distribution-free methods, and a survey of the historical 
connections between psychology and statistics. The main bibliography is classified into 13 main 
sections, many of them further subdivided, and there is an excellent cross-referencing scheme which 
indicates when, as often, items belong to more than one section; in addition information is given at 
the beginning of the main sections about which other sections are likely to include relevant material. 
Psychologists using this book should have no difficulty in finding material relevant to their problems; 
they may well need patience in sifting the mass of material found. One important effect of the book is 
to draw attention to the many well-developed distribution-free methods which appear hardly at all in 
the psychology journals: robust estimation, confidence intervals, predictive measures of association, 
log-linear models for contingency tables, multivariate methods, sequential methods, multiple 
comparisons. This book is needed and will be immensely valuable. 


A. E. DUSOIR 

Bradley, J. V. (1968). Distribution-free Statistical Tests. Englewood Cliffs, N.J.: Prentice-Hall. 

Goodman, L. A. & Kruskal, W. H. (1972). Measures of association for cross-classifications. IV: Simplification of asymptotic variances. Journal of the American Statistical Association, 67, 415-421. 

Hollander, M. & Wolfe, D. A. (1973). Nonparametric Statistical Methods. New York: Wiley. 

Killion, R. A. & Zahn, D. (1976). A bibliography of contingency table literature: 1900-1974. International Statistical Review, 44, 71-112. 

Savage, I. R. (1962). Bibliography of Nonparametric Statistics. Cambridge, Mass.: Harvard University Press. 

Siegel, S. (1956). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill. 


Children's Learning and Attention Problems. By M. Kinsbourne & P. J. Caplan. Boston, Mass.: 
Little, Brown. 1979. Pp. xi + 300. 


This book is intended primarily for physicians, but also for those in other professions in the USA 
who are faced with the assessment and management of cases of learning disability in school children. 
It comprises an analysis of disability with recommendations for practice, and must be read with the 
intended audience in mind. It follows that readers need to ask how far the thinking behind the 
discussion relates to their own, and that those outside the USA need constantly to question the 
implication of working within different patterns of professional involvement. 

The analysis of learning disability is clearly presented, and attention is brought to bear on 
difficulties experienced by children who are judged not to be generally retarded, nor underachieving 
because of some deficiency in motivation. Such children are said to have selected cognitive 
disabilities. These are in turn seen as either cognitive power disability, where the child experiences 
trouble in understanding or learning in a particular part of the curriculum, and is effectively 
selectively retarded, or cognitive style disability, which covers maladaptive ways of going about 
learning by showing either impulsive or compulsive behaviour. The impulsive is the easily distractible, 
hyperactive child, while the compulsive is the child who focuses too intensively or over-long on a 
particular task. 

In the first chapter, in addition to establishing these hypothetical categories of disability, the 
authors also state their view that problems are best seen as variations within the normal rather than 
as the result of pathological states. They believe that the key to helping children is good teaching 
rather than the application of medical or paramedical technology. But when they come to review the 
personnel involved in intervention, this point seems to elude them. Consideration of the number of 
‘experts’ who may be involved in any particular case leads them to argue that the learning clinic 
personnel would be wasting their time by visiting schools to see children in their natural learning 
context. This amounts to a firm statement of belief that diagnostic interviews and tests in clinics and 
hospitals are a firm foundation for recommendations for educational treatment. We thus find the 
authors speaking of children who are referred because they have educational difficulties, but paying 
only lip-service to questions of the interpretation of assessment to teachers, and taking no account of 
the contexts in which the problems arise nor in which treatment will be attempted. This approach is 
not only a feature of the first chapter — it characterizes the whole book. In fact we find on p. 155 the 
following astonishing statement: ‘Particularly if good rapport is maintained with the local school 
authorities, it is often possible to report the outcome of a learning disability evaluation to the relevant 
school personnel in writing, and to expect that the report will be studied with care and heeded if 
possible. But there are some occasions when such routine reporting has to be supplemented by a 
direct meeting with relevant school personnel to safe-guard the child's interests.’ This is followed by a 
list of conditions which imply a knowledge of school policies, particularly as they relate to the 
individual child. How this can be obtained without visiting the school, and seeing the child and 
teachers at work there, is not considered. 

The limitations inherent in applying a clinical model to the diagnosis and treatment of educational 
problems are evident throughout the discussion of both the cognitive power and the cognitive style 
disorders. Occasionally it is indicated that selective cognitive power disability might arise from some 
deficiency in educational experience, but the vast weight of the discussion of the approach to such 
cases rests on supposed deficiencies in the child. The reader is warned not to base prognosis on 
supposed brain-states, but the text does not always clearly distinguish between impairment of the 
brain, of brain development, and of cognitive development. Similarly, reference to ‘areas of function’ 
seems to slip between the cognitive and the neurophysiological domains. 

More than half the book is taken up with discussion of cognitive power disability against this 
background, and the lack of reference to an educational model means that it is almost entirely 
devoted to guiding the reader through diagnosis and prediction by means of discussion of clinical 
interviews and the use of standardized tests. In this country at least, knowledge derived from such 
procedures is likely to be of limited help to teachers, though it may be more meaningful within the 
American grade school system. Two points which this reader felt were generally valuable were the 
recommendation that the child be told what is going on in interviewing and testing, and the reminder 
that prediction of school performance is more valuable for the near than for the distant future. Many 
would agree that the most useful procedures are to check on suspected deficiencies just before entry 
to school, and on cases when actual difficulty occurs within school. 

The section on cognitive style disorders is relatively short, and within it most attention is paid to 
the impulsive style. Here the authors seem concerned to press their own case for the use of drugs in 
the control of hyperactivity. They refer to some of their own experimental work showing the 
attention-normalizing effects of stimulants in certain learning tasks, and the importance of 
maintaining this effect for consolidating learning. After a brief discussion of the problems of aetiology 
and diagnosis, their conclusion favours medication treatment, but on the basis of prior testing on 
each individual. Problems of treating symptoms rather than the condition itself are rather 
persuasively dismissed by treating these symptoms as personality traits or temperaments. Other 
treatments, such as behavioural, for hyperactivity are treated very slightly indeed. As for compulsive 
cognitive style disabilities the authors have very little to say. It would seem that although this 
hypothesized condition is to be found in children, relatively few cases are referred for help. Very 
surprisingly the book ends abruptly with no further discussion. 

I cannot find any good reasons for recommending this book. The repetitive, staccato style is hard 
to read; the professionals who face children with learning disabilities are likely to find the advice 
patronizing and limited; while any reader who wishes to become really informed about the nature 
and treatment of learning difficulties will find comparatively little in the text and no references to 
support or extend it. 

HAZEL FRANCIS 


‘Time’ in the Production and the Perception of Speech. Edited by W. J. Barry & K. J. Kohler. 
Arbeitsberichte no. 12. Institute of Phonetics, University of Kiel. 1979. Pp. x + 327. 


This book is for addicts only. It is the proceedings (in English) of an interdisciplinary colloquium 
organized by the Phonetics Department of Kiel (West Germany) University, who see the symposium 
as ‘an exposition of the scientific stand of the Kiel phonetics department’ to show that they are part 
of the move away from classical phonetics. Modern phoneticians are increasingly concerned with the 
dimension of time. They reject the classical approach that relied for its data on a transcription of 
speech into a string of atemporal symbols. It is now technically straightforward to study the dynamic 
as well as the static aspects of speech articulation, both directly by means of a multitude of ingenious 
gadgets for making accessible the movements of the articulators (including a computer-controlled 
X-ray micro-beam that can track pellets attached in the mouth), and indirectly through computer 
simulation and synthesis. Perceptually too there is an increased concern with temporal factors in 
natural, connected speech rather than the previous preoccupation with isolated syllables. 

The title of the symposium, then, promises well. But sadly rather little of the current excitement in 
the area is captured. With a small number of exceptions the papers contributed to the symposium are 
prosaic. The exceptions are a useful theoretical position and review of ‘The time course of speech 
perception’ by Sieb Nooteboom, a prospective look at speech production by James Lubker, some 
careful experiments (and a lot of neo-Gibsonian proselytizing) from Quentin Summerfield on the 
effect of rate of speech on the perception of voicing, and a presentation of some solid data from 
German and French on durations of voiced and voiceless stops and their preceding vowels by Klaus 
Kohler, one of the meeting's organizers. 

Just over a third of the book is an under-edited transcription of the discussion following each 
paper. Some of the comments are worth preserving and clarify their accompanying papers, but the 
bulk should have been removed, being either trivial (two of the participants admitted to having had 
their tonsils out), repetitious or merely curious (Norma Baker stopped stammering when she became 
Marilyn Monroe). 

What of the issues? The central one is how to get from discrete, static and context-free linguistic 
units (such as features or phones) to a dynamic, continuous and context-adjusted speech production, 
and back again. Some feel that at our present level of ignorance it is better to be unprejudiced by 
linguistic units and study articulatory movement independently as part of the general problem of the 
control of coordinated movement. Certainly we need to know more at the level of movement control 
in general, and the ideas thrown out by action metatheory provide a challenge to speech workers to 
graduate from the more simplistic translation theories which have dominated speech research so far. 
But it is dangerous in speech research to ignore the communicative function: what it is that the 
movement is trying to achieve. For instance, while it is sensible to talk about a simple articulatory 
goal in the articulation of bilabial stops, where the two lips must meet to close off the oral vocal 
tract, a more abstract goal is needed for vowel articulation. Speakers will, without the benefit of any 
auditory feedback, adjust the shape of their tongue, when their jaw position is constrained by a 
bite-block, in order to maintain a roughly constant vocal-tract area function. This more abstract 
articulatory goal in turn becomes inadequate for the case where subjects, prevented from rounding 
their lips, lower their larynx in compensation. The compensation does not produce a constant 
vocal-tract area function but it does tend to produce a more constant acoustic output since both 
manoeuvres have the effect of lowering formants. This intimate link between production and 
perception has been repeatedly stressed by workers in perception and translated into newtalk by the 
neo-Gibsonians, but that should not prevent the unconverted heeding it too. 

C. J. DARWIN 


The Development of Memory in Children. By R. Kail. Reading: W. H. Freeman. 1979. Pp. 168. 
Cased, £7.40; paper £3.40. 


Robert Kail has written an introduction to the development of memory which is a model of 
unpretentiousness and clarity. He has a talent for picking illustrative studies and describing their 
essentials in plain, concrete English. The book covers material which has been described until now 
either in journal form or in relatively advanced chapters directed largely at a research audience. This 
is the first book-length study of the development of memory which is clearly intended for students, 
particularly introductory students, and, given that goal, it communicates very effectively. 

After a brief introduction containing a few illustrations of everyday situations involving memory, 
Kail describes the development of mnemonic strategies such as rehearsal, the child's understanding of 
the psychology of memory (so-called metamemory), the relatively small age changes in recognition 
memory, the impact of knowledge on memory, individual differences, and finally the effect of 
mnemonic constraints on problem-solving and thinking. With a few exceptions, the book comes to a 
reliable and lucid assessment of what we know and do not know. The only major exception in my 
view is a rather glib review of memory for prose, in which some of the profound stabilities across age 
are ignored while questionable assertions are made about the older child's greater inferential abilities. 

Perhaps the only weakness of the book is, from a certain point of view, its strength. By being up to 
date, experimentally oriented and 95 per cent American, the book has an air of topicality which is a 
bit myopic. In particular, the book lacks any historical perspective. Everything seems to have just 
been written or to be in press. There is no discussion, for example, of the traditional Russian 
emphasis on the way in which memorizing is largely a by-product of activities directed at less 
intellectual goals. There is no hint that the striking increase in the spontaneous use of rehearsal with 
age is absent in non-Western cultures. Finally, by keeping his nose pressed firmly against the 
contemporary journals, Kail has neglected most of the issues that those journals neglect. To take one 
example, introspective evidence and experimental findings (see, for example, Brown & Kulik, 1977, 
on memory for the news of Kennedy's assassination) indicate that whether or not we remember an 
episode has something to do with the emotion that we felt at the time. Yet emotion is not mentioned 
once in the book. Only once does Kail allow himself to stray from the well-trodden contemporary 
path, in an all too brief discussion of the memory skills of idiots savants. 

In short, students who read Kail's book will be well equipped to read what they find in today's 
journals — if that is what you want. 

P. L. HARRIS 


Brown, R. & Kulik, J. (1977). Flashbulb memories. Cognition, 5, 73-99. 


Sensation and Perception. By S. Coren, C. Porac & L. M. Ward. New York: Academic Press. 1979. 
Pp. 439+ appendices. 


In spite of its length this is not a book for the specialist; rather it is intended as an introductory text 
for students, particularly American students, and the style of writing reflects this. Each new idea or 
phenomenon is developed slowly, wherever possible with examples from everyday life. This makes for 
easy and fairly entertaining reading, but at the same time means that the information is thinly spread; 
the ratio of time spent reading to new concepts gained is rather high in relation to texts written with 
British students in mind. 

The book contains chapters on most of the topics one expects to find in a book on sensation and 
perception (detection and discrimination, sensory scaling, anatomy and physiology of the various 
sensory systems, basic attributes of sensory stimuli, the perception of space, form, time and motion, 
perceptual constancies) and in addition contains three chapters which are a little unusual: attention 
and search, learning and development, and individual differences. These three chapters form a useful 
link between the study of perception and the study of ‘cognitive processes’ and this should be of 
considerable value to students, since a common complaint is that the contents of psychology courses 
and the teaching of psychology are too fragmented. 

One rather nice feature of the book is a series of 80 demonstration boxes, which describe simple 
demonstrations which can be tried out by students using readily available materials. Most of these 
are well thought out, and should be effective, and several are quite ingenious. It is a feature which 
other authors might well consider adopting. 

The book is most successful in those areas where the authors have specialist skills, notably in visual 
perception and visual illusions. The material for other areas, such as hearing or the chemical senses, 
seems to have been derived largely from secondary sources, such as other textbooks. Sometimes this 
has led to a rather old-fashioned emphasis, and in several cases data are presented from early research 
which are not supported by more recent experiments. For example, the well-known experiment of 
S. S. Stevens, showing large shifts in pitch for pure tones as a function of intensity, is presented 
without any mention of the fact that more recent work has failed to reveal such large effects. The 
discussion of the pitch of complex tones is also misleading. It is well known that the pitch of a 
complex tone is normally equal to (or almost equal to) the pitch of the fundamental component, but 
it is quite wrong to say, as they do, that ‘the pitch of a sound is greatly determined by the frequency 
of the fundamental’. A further mistake arises in discussing the ‘phenomenon of the missing 
fundamental’. They state that for two complex tones of the same repetition rate, one with a 
fundamental component and one without, ‘both of the sounds are subjectively the same’, and that 
this is an ‘illusion’. In fact, we can hear a difference when the fundamental is removed, but the pitch 
of the sound does not alter. The most reasonable interpretation of this is that the pitch of a complex 
sound is not, in general, determined by its fundamental component, so that no ‘illusion’ is involved. 

In spite of these criticisms this is for the most part a useful and clearly written book. I would be 
happy to recommend it to my students, but as supplementary reading, rather than as the main text to 
accompany a course. 

BRIAN C. J. MOORE 


The Physiological Approach in Psychology. By C. F. Levinthal. Englewood Cliffs, N.J.: Prentice-Hall. 
1979. Pp. 466. £11.00. 


There are many physiological psychology textbooks aimed at the undergraduate market. The rate of 
production has increased over recent years, and the problem of discriminating between them has 
grown accordingly. 

With few exceptions their format is standardized; they are sectioned into neuronal structure and 
function, neuro-anatomical organization, sensory systems, motor systems, hemisphere function, 
emotion, motivation, learning and memory, etc. Levinthal follows the same pattern. 

Books may vary, generally speaking, in terms either of information or organization; rarely both. 
As it is currently gripped by a dynamic empiricism, physiological psychology lends itself more readily 
to information rather than to organization, and textbooks reflect this; they may be differentiated in 
terms of topicality, and here Levinthal has a slight edge over the competition, exemplified by mention 
of enkephalin and the opiate receptor. 

However, a substantial core of the subject is standard and relatively unchanging - the neuron, 
organization of the nervous system, structure of sensory and motor pathways, etc. It is difficult to see 
how these could be presented originally, and Levinthal does not try, sensibly using, with 
acknowledgements, many figures from other recent textbooks. 

Given this common core, are there distinctive features in Levinthal which, alongside its topicality, 
make it a better investment than others in the field? These features would, if present, be found within 
the organization of the book; but here we find reiterated the standard organization and its associated 
problems. It may be that integrative concepts do not exist in physiological psychology, to be carried 
over from chapter to chapter, blurring the margins between apparently discrete topics; it may be that 
advances in cognitive psychology have left the physiological psychologist behind, wrestling with 
‘boxes’ while the cognitive psychologist embraces ‘levels’ of processing. It may be, but on the other 
hand it may not. 

Levinthal has a chapter on brain chemistry and drugs, incorporating the pharmacotherapy of 
schizophrenia, depression and anxiety. In a later chapter there is reference to the possible 
involvement of chemical pathways in the ventromedial and lateral hypothalamic syndromes, and in 
a still later chapter there is a description of Olds’ original work on self-stimulation and reward 
pathways. These three topics are not independent. They reflect a view of chemical pathways as 
representing functional units in drive, motivation, reward, i.e. in behaviour — a view that has found 
expression in the most dramatic expansion of brain research over the last two decades, and to which 
lip-service, at least, should be paid. 

Dotted around chs 13 (Learning and reward) and 14 (Remembering) is the hippocampus. Clinical 
data give us, unusually, a reasonably clear picture of what human functions the hippocampus may be 
involved in; hippocampal damage can produce amnesia. Hippocampal damage in monkey and in rat 
does not, according to Levinthal, produce amnesia. Why not? The question is not trivial, as 
comparative neuropsychological data are the foundation of extrapolation, and thus of physiological 
psychology. A student at any level should be made aware of the problem and its importance, as they 
should also of the wider issues; a study in which amygdalectomized monkeys are returned to the wild 
and allowed to starve to death (Levinthal, ch. 12) is perhaps deserving of some comment beyond 
mere description, as any interested reader would benefit from its reasoned justification. 

The frustrating feature of most textbooks, including Levinthal, is that the material for a more 
integrated and interesting organization is present, but the ‘compartmentalized topic’ approach has 
become institutionalized. Unfortunately, it does lend itself readily to a series of lectures, with chapter 
headings transmogrified into lecture titles. 

The Physiological Approach in Psychology bears comparison with most other books in the field. It 
is clearly, if unimaginatively, written, and the figures are excellent. At the end of each chapter there is 
a short but useful list of further readings, and a reasonably exhaustive glossary at the end of the book 
reminds you where the flocculonodular lobe is. As a hardback it is quite expensive; given the 
problems of differentiating significantly the content and organization of physiological psychology 
texts, price becomes an important factor, and here the soft-cover editions of recent favourites score 
highly. When Levinthal emerges softly, it will, as the most recent, become recommendable in a field 
where recency has been made to count. 

SIMON GREEN 


A Program for Families of Children with Learning and Behaviour Problems. By Martin A. Kozloff. 
Chichester: Wiley. 1979. Pp. 450. £14. 


An experimental psychologist once said ‘If you want to know about child rearing, ask your 
grandmother — don't consult a psychologist’. The trouble, of course, could be either absence of a 
grandmother, an unwise ancestor, or a lady whose wisdom was unanalysed and not communicable by 
her. If you want to know how to help parents rear deviant children, Martin Kozloff invites you to 
consult him; he has worked with the problem for over 10 years, read widely and analysed extensively, 
and he communicates in strong, clear language. 

He starts with a brief review of studies in training parents in behaviour modifications and rightly 
concludes that for the majority the programs were ineffective either in the short or long term. Since 
merely leaving parents to cope unaided with their autistic, aggressive, mentally retarded or 
language-disordered offspring was unacceptable, the author considered what might have gone wrong, 
and also wrote a book, Educating Children with Learning and Behaviour Problems (1974), which is 
recommended as a companion volume. 

Dr Kozloff's program developed from his early work with four families with autistic children. 
Their parents had been taught the basic principles of learning and social exchange in order to 
accelerate development in the home. However, the author considered the range of skills was too 
narrow, his clients lacked techniques for evaluating their children's strengths and needs, as well as 
their own; they were unable to develop alternative teaching programs or to solve new problems. Even 
his earlier book did not seem sufficient. Since the process of therapeutic change was apparently 
affected by many complex factors, both within and outside the family, a broader approach than 
traditional behaviour therapy seemed necessary. To equip himself better the author consulted works 
on communication theory, family therapy, sociology, women's studies, political and social 
philosophy, and psychodynamic psychotherapy. He also analysed large amounts of case-study 
material, written and taped, and put it all together in a highly structured program. In his own words 
this book, in step-by-step fashion, ‘focuses on: (1) first contacts with families; (2) conducting initial 
interviews; (3) strengthening readiness to change; (4) formalizing a working relationship; (5) assessing 
the strengths and needs of the family system — via interviews, direct observation, and an Assessment 
and Programming Guide; (6) establishing short-term and long-term goals; (7) planning a program; 
(8) conducting programs for individual families or for groups — which includes teaching parents a wide 
range of skills for evaluating the family system and their children’s needs, planning an educational 
program, managing productive patterns of interaction, conducting home teaching programs, 
problem-solving and working to maintain and generalize beneficial change; and (9) conducting 
ongoing and summative evaluations of the program, revising it as necessary.’ 

One must hope that this intrepid seeker has got it right this time. However, some problems remain. 
Not only does the book suggest how parents and their children can be programmed but the therapist 
as well. There are such detailed prescriptions as ‘present yourself as empathetic, competent and 
businesslike...’; ‘At each step, you are setting the occasion for the parents to emit behaviors (the 
components of readiness) which you can reinforce or confirm. As the parents emit readiness 
behaviors at one step, they become more likely to emit readiness behaviors at the next...’; ‘When 
specific problem behaviors and behavioral inadequacies of their child are described, be empathetic 
(but not sickeningly so). Lean forward, nod your head and say “Mm hmm”, “I understand”, or 
“That behavior must be hard to live with” to indicate that you are following them and that you 
understand what they have been going through.’ 

The publisher's announcement, supported by the author's statements, commends the book to 
teachers, social workers and parents as well as psychologists. One wonders how parents will take to 
it, and whether a professional or para-professional who requires as much advice on the minutiae of 
social interaction should ever be let loose on a family. Despite the lack of differentiation between 
families whose children present widely different disorders, the book does contain some informed 
advice and an analysis of parents’ various problems which few grandmothers would ever have 
encountered. One must await some proper evaluation of this program. 

ANN M. CLARKE 


Psychiatric Symptoms and Cognitive Loss in the Elderly. Evaluation and Assessment Techniques. By 
A. Raskin & L. F. Jarvik. Chichester: Wiley. 1979. 


The book consists of a collection of papers presented at a workshop in 1977. The first part deals with 
methods of assessing psychopathology in old age, focusing mainly on depression and anxiety. The 
second part is concerned with methods of assessing cognitive loss in old age. What the book has to 
offer is primarily a guide for the selection of the most appropriate evaluation procedures, and as such 
it could be extremely useful to the clinician or the researcher. It is a methodological handbook. For 
the general reader its interest is limited. Lists of different rating scales and tests, followed by brief 
comments on their relative advantages and disadvantages, make dull and often repetitive reading. 
Comparisons between different methods are often vitiated by procedural and sampling differences so 
that it is not always possible to determine which is preferable. There is little theoretical framework, 
and evaluation procedures appear to be generated by, and judged on, empirical considerations, so the 
general reader does not gain much insight into the nature or aetiology of psychopathology in old age. 

Throughout the book contributors are scrupulous and perceptive in pointing out the difficulties 
and shortcomings of available procedures, and offer sensible recommendations for remedying those 
that are remediable. Many of the problems associated with assessment of the elderly are not readily 
soluble, but need to be acknowledged, and conclusions qualified accordingly. One pervasive problem 
is the lack of normative data against which to assess impairment: another is the confounding of age 
itself with other age-related factors. The point is well made that the value of a test depends on its 
goal. So, for example, it may be difficult to assess the effects of ageing on the incidence of depression 
because old and young people differ in the way they respond to questions; old people may be more 
prone to suppress or deny symptoms; they give different weight to physical, affective and behavioural 
manifestations; and the effects of ageing are confounded with physical ill health, medication, 
nutrition, cognitive and sensory impairment and environmental circumstances. Yet a rating scale 
which is powerless to disentangle such factors may be quite adequate to monitor changes within an 
individual over time, as a function of treatment. 

The chapter by Michalewski, Thompson & Patterson on the use of EEG techniques outlines a 
promising and rapidly developing method of assessment which has greater objectivity and sensitivity 
than rating scales. EEGs yield indices which appear to correlate quite reliably with age, behaviour, 
affect and biochemical state and to exhibit lawful changes. The technique obviously has great 
potential value for revealing the causal mechanisms underlying the effects of ageing, and for 
monitoring drug-induced changes. 

In the second part of the book, which reviews methods of testing for cognitive changes, 
contributors are again at pains to point out the pitfalls of psychometric testing. In old age tests of 
specific cognitive abilities may be contaminated by general factors such as anxiety, lack of motivation, 
distractability, slowness, sensory deficits and illness; normative data may be lacking; age differences 
are confounded with cohort differences, and test-retest variability is high. Some helpful advice is 
given about ways of minimizing these problems. In the final chapter Cohen & Eisdorfer advocate the 
use of information-processing paradigms to test for cognitive impairment. They are undoubtedly 
correct in maintaining that information-processing techniques have much greater power than 
psychometric tests to identify precisely what components of the cognitive system are impaired and 
what components are intact, but in contrast with the general tenor of the rest of the book, they neglect 
to point out the difficulties. Some of the paradigms they suggest are unsuitable for testing old people. 
For example, old people may be unable to perform adequately in a partial report task because of 
sensory loss, reduced speed of processing, dislike of unfamiliar apparatus, and fatigue over a series of 
trials. Information-processing techniques are potentially more informative than standard 
psychometric tests, but need to be carefully selected and adapted for use with the elderly. 

GILLIAN COHEN 


Human Development. By T. G. R. Bower. San Francisco: W. H. Freeman. 1979. 


Dr Bower has written three other books on developmental topics during the past few years. 
Development in Infancy (1974) was in large part devoted to a presentation of Bower's extensive, 
distinctive and controversial investigations of perceptual, motor and cognitive development during 
infancy and early childhood. It served to bring together, within the bounds of a single, compact 
volume, material which had previously been available only in the form of journal articles or had 
been unpublished. As such, it served an important purpose. Three years later A Primer of Infant 
Development (1977) appeared. This covered much the same ground, although it also contained 
chapters on social and language development, topics which had received little consideration in the 
previous book. It was followed, also in 1977, by The Perceptual World of the Child. There is a great 
deal of overlap between this volume and the two previous ones, though to some extent this is justified 
in that The Perceptual World of the Child was written for the informed lay person who may not have 
had prior exposure to the other works, directed as they were towards the academic community. 

Human Development could be fairly described as the omnibus edition of the three earlier volumes. 
More than half the book focuses upon infant development and, on this theme, there is little that has 
not previously appeared in either or both of Development in Infancy and A Primer of Infant 
Development. The topics, pictures and illustrations will be recognized by anyone familiar with these 
books. One is reminded of the story of Little Blue Riding Hood in which only the colour had been 
changed to prevent identification! However, the volume also has chapters on cognitive, social and 
personality development which extend beyond the infancy period into childhood and adolescence. 
Treatment of these topics is fairly conventional; the work of Piaget is extensively drawn upon in the 
discussion of cognitive development. 

In his preface, Dr Bower indicates that his aim in writing this new book was to present a 
theoretically coherent basic outline of human development in contrast to the catalogues of facts which 
he feels characterize the majority of developmental textbooks. The position he espouses and advocates 
most explicitly in the final chapter is described as differentiation theory. This theory argues that 
development is a process whereby the child is born with or very rapidly acquires abstract, high level 
frameworks for interpreting the world; from these abstract, general frameworks more specific 
concepts or rules are derived so that behaviour can adapt to specific situations. It is this process of 
downward specification to lower level from higher level which is referred to as differentiation. Bower 
acknowledges that the notion of developmental differentiation by itself is not particularly 
unorthodox, though he fails to discuss the relationship between his version and others in the 
field — for example, the principle of orthogenesis advocated so long ago by Heinz Werner. What is 
unorthodox is the claim that higher levels precede lower levels of abstraction from the beginning of 
development. Unfortunately, it is this part of the theory which is least satisfactorily dealt with and it 
is not always evident how we are to understand the argument that lower level concepts are derived 
from more abstract concepts rather than the latter being generalized from the former. 

Consider two examples which Bower uses to illustrate his argument: (i) ‘Consider a baby in a 
simple conditioning situation. The child is again learning at least three things. Level 1. When I kick 
my foot the mobile goes round. Level 2. I can control events in the world. Level 3. The world is a 
consistent and orderly place’ (p. 223). (ii) ‘Consider a baby faced with a mobile that moved every 30 
seconds regardless of whether or not the baby was doing anything. That baby would learn: Level 1. 
The mobile moves by itself every 30 seconds. Level 2. I cannot control events in the world. Level 3. 
The world is a consistent and orderly place’ (p. 224). Now, it is surely obvious that the necessary 
relationship between levels in these examples is that higher (levels 2 and 3) are generalized from lower 
(level 1) rather than lower being derived or differentiated from higher. In particular, these examples 
make it clear that the kind of level 2 concepts which arise will be a function of level 1 experience. This 
is not to deny that the nature of level 2 concepts will be of critical importance for the quality of 
transfer of experience to new situations. Nor is it to deny that with further development the level 3 
type concepts which emerge may become the impelling motive to search for level 2 and 1 
applications. It is to assert, however, that the initial ontological sequence of acquisition is from lower 
to higher conceptual levels. Such an interpretation is in full accord with a learning-to-learn view of 
early development and the growth of competence, a view to which Bower appears to be not 
unsympathetic. 

A more tightly argued and more fully elaborated representation of the theory could have turned 
this into a most useful volume. As it stands, however, it adds only marginally to other recent 
publications by the author. 

HARRY McGURK 


Bower, T. G. R. (1974). Development in Infancy. San Francisco: W. H. Freeman. 

Bower, T. G. R. (1977). A Primer of Infant Development. San Francisco: W. H. Freeman. 

Bower, T. G. R. (1977). The Perceptual World of the Child. London: Fontana/Open Books. 


Philosophical Problems in Psychology. Edited by N. Bolton. London: Methuen. 1979. Pp. xiii + 207. 
£8.50. 


Professor Bolton has assembled a number of papers by philosophers and psychologists with the aim, 
as he puts it, of bringing the two disciplines into closer relationship. That aim is only very partially 
realized, and the book is given a rather spurious unity. Its three parts are subtitled ‘Reason and 
action’, ‘The psychology of action’ and ‘The context of action’. It is perfectly true that in the first 
part two philosophers, Philip Pettit and Colin McGinn, do discuss the explanation of action from 
points of view which they have presented elsewhere — points of view inspired to one degree or another 
by Donald Davidson. Whatever the value of these expositions of the Davidsonian approach to the 
philosophy of action, their exhibitions of concern for psychology as a discipline are gestures only. 

The second part of the book has little to do with action at all. Wolfe Mays defends Piaget against 
some recent philosophical criticisms of him. He is not always successful. He seeks, for example, to 
defend Piaget against a criticism of my own to the effect that Piaget offers an inadequate 
consideration of the social; but because he considers only one paper of mine, written 12 years ago, he 
fails to take account of the point that for Piaget the social can be only one part of the individual's 
environment and not an essential precondition for the very existence of knowledge. Michael Morgan 
considers and rejects a distinction between separate physical and phenomenal spaces. He does it 
interestingly, but it has to do with psychology in only a minimal way. N. E. Wetherick rhapsodizes 
on the foundations of psychology, considering alternatives to empiricism, such as Marxism and 
phenomenology. In the end he embraces Bhaskar's ‘transcendental realism’ (given a very un-Kantian 
sense), which when applied to psychology is supposed to reveal the object of its study as ‘a 
model-making structure’. What that comes to remains fundamentally unclear. By comparison, 
Margaret Boden’s ‘The computational metaphor in psychology’ provides a modest and civilized 
account of what can be expected from artificial intelligence. It is one of the best papers on the subject 
that has been written by her — perhaps one of the best altogether. 

In the final section there are three papers which are difficult to characterize. Arthur Still seeks to 
defend a realistic view of perception, in particular Gibson’s, against constructionist views, which to 
his mind run into difficulty in explaining how the construction of a representation of reality is 
possible. The trouble is that it is not always easy to recognize the positions that he sets out in his 
discussion. It might have been better to devote more thought to the definition of the questions which 
are at stake. The editor, Neil Bolton, defends the phenomenological approach against 
‘psychologism’. What that involves is all too briefly set out and is ridden with the jargon beloved by 
phenomenologists. The final paper, by John H. Heaton, reaches the surprising conclusion that 
psychotherapy should not be based on psychology. ‘The crux of theoretical work in psychotherapy’, 
he says (p. 192), ‘is to confront the representative powers of language, to recognize its limits and 
what can and cannot be said.’ It is a very obscure, and perhaps eccentric, piece, but there may be 
more to it than has been said. 

The impression that one comes away with, perhaps unfairly, is the following. Two of the 
philosophers make gestures towards psychology, while two others speak almost as psychologists. The 
psychologists, on the other hand, give the impression of being would-be philosophers. I cannot think 
that this is the way to bring the disciplines into closer relationship. One might have hoped from the 
title of the book for a closer philosophical examination of certain psychological concepts. That is 
hardly what one gets. 

D. W. HAMLYN 


The Coming Age of Psychosomatics. By M. Carruthers & P. Mellett. Oxford: Pergamon. 1979. 


This is a distinctly less than average collection of papers, from the Annual Conference of the Society 
for Psychosomatic Research held in 1977. Perhaps it is the generality of the label ‘psychosomatic’ 
which causes the papers to be quite diverse in both content and quality. The topics range from birth 
trauma to death, from jogging to stage fright, from the pineal body to minimal brain damage, and 
from asthma to jet lag. 

It is this variety which would seem to give rise to the apparent attractiveness but ultimate weakness 
of such a collection since there is little consistent emphasis to attract the more committed reader or 
research worker. Exceptions to this are the two interesting cardiovascular papers by Obrist et al. and 
by Steptoe, although even these appear in two different sections. Indeed the organization of the eight 
sections is puzzling at times and not particularly helped by some of the florid section titles (e.g. Death 
2001; Yoga and yoghurt; Space-age programming — sexual and non-sexual cycles). 

Before embarking on specific comments, two general complaints should be aired. Firstly, of the 15 
papers listed on the contents page, only nine are fully reported since the other six consist of 
half-page abstracts. This appears to be doubly unfortunate as two of these six looked as if they might 
have been amongst the best papers at the conference. These are the papers (or abstracts) by Crow 
and by Lader which comprise the ‘Central psychopharmacology’ section. Some of the other abstracts 
did not look as if they would have necessarily benefited from a fuller presentation, although I was 
curious to know what was said in the paper by Mackarness (‘Can the food we eat drive us mad?’). 

My second general complaint is that this conference report was originally put out in exactly the 
same form, but without the hard covers, as Vol. 22, no. 4, of the Journal of Psychosomatic Research 
(1978). No further editing has been attempted, and in view of the diverse contents, some linking 
editorials might have been valuable. Frankly I can't see too many people wanting to pay out £10.00 
for a couple of good papers and some promising abstracts when they can easily get hold of the back 
number of the journal. 

Turning to some of the contents, the first section of the book, entitled ‘Birth 1984’, contains a 
rather amazing paper by Lake on reliving birth trauma as a treatment for a wide range of 
neurological and psychological problems, and one by Mellett on the early origins of asthma. The 
former contains some very dramatic case examples and some extrapolations which can at best be 
described as speculative. Mellett, also in speculative vein, strings together some quite diverse evidence 
in suggesting that asthma may stem from the very earliest patterns of neonate respiration. 

Later on, Blythe unconvincingly attempts to explain emotional disorders in terms of minimal brain 
dysfunction in a paper which is as muddied as the concept of minimal brain dysfunction. The possible 
merits of jogging in the treatment of depression are explored by Greist and co-workers, who found it 
to be as effective as time-limited or time-unlimited psychotherapy. This either says a lot for jogging or 
not very much for psychotherapy but it turned out at the end of the paper that the psychotherapists 
were second-year psychiatry residents with quite limited experience and the running leader had seven 
years' experience. Finally, Cartesians will be pleased to see that the pineal body is making a comeback 
in psychology. Pretty well ignored since Descartes thought of it as the seat of the mind and immortal 
soul, the pineal body is once again being investigated, primarily from an endocrinological viewpoint. 
The contribution by Mullen and no fewer than nine co-workers presents an interesting review of 
current work, together with a few findings showing that melatonin and 5-methoxytryptophol can 
both be found in man and show a clear circadian periodicity. 

After reading through the book, one could either be charitable and call it a varied and occasionally 
stimulating collection of papers or more brutal and call it rather poor and haphazard with a few 
exceptions. I look to books like this one to catch up with the state of the art, particularly in this area 
which should be relevant to my own teaching, but I came away feeling distinctly dissatisfied. 

JOHN WEINMAN 


Psychological Development from Infancy: Image to Intention. Edited by M. H. Bornstein & W. 
Kessen. Hillsdale, N.J.: Erlbaum. 1979. Pp. 404. £15.20. 


The focus of this book is on the transition from early infancy to childhood — from ‘birth to 
conversation’. What precursors to later cognitive and social development can be identified in the first 
year of life, and what kind of continuities are there between the early and later months? These were 
the central issues which the editors hoped that their contributors would address. While there is great 
variation between the chapters in the strategies used to approach these questions, and in the success 
attained, several themes do recur insistently throughout the book. The concern of most of the authors 
is to present a theoretical outline, rather than to discuss empirical evidence in detail; though the 
descriptive terms and the metaphors vary greatly from chapter to chapter, the framework is almost 
always a Piagetian one, and there is an emphasis not only on stages, and on the universality of the 
sequence of stages, but on discontinuity between stages. 

The book is organized around three topics — perceptual and motor development, cognitive 
development, and language and social development; a critical commentary on the chapters on each 
topic is included. This format works well, particularly in the section on perceptual and motor 
development, where Gibson’s discussion of the chapters by Kopp, Bornstein, Fagan and Pick is 
constructively critical, highlighting the important points and omissions in a clear and helpful fashion. 
These chapters contain much more detailed empirical material than the later chapters, and provide 
the reader with a useful review of issues in research on the development of perception, spatial 
reference and motor development. 

In the section on cognitive development all four contributors attempt to construct theoretical 
frameworks on a grand scale. McCall presents an account which emphasizes the change in 
developmental functions and in individual differences over the first 3 years (commenting with care on 
the distinction between these). Some of his ideas, and the empirical data from infant mental tests on 
which he draws, have been published elsewhere, but it is an interesting account, and it is useful to 
have it in this context alongside that of Kagan, since the parallels and differences between their ideas 
can be seen clearly. While Kagan, for instance, stresses the importance of change in the ability of the 
infant to use memory as the salient feature of cognitive development on which advances such as the 
development of object permanence and separation protest depend, McCall argues that memory alone 
cannot fully explain such changes in social behaviour. Mussen's sensible comments on these chapters, 
and on those of Elkind and Papousek, are made in very general terms. He cautiously congratulates 
the authors for attempting to build theories but admonishes McCall and Kagan for lack of clarity and 
rigour, and for a ‘too-rich’ use of metaphor. Much of this metaphor has a biological tone, and 
Mussen is surely right in protesting that references to biological ‘maturation’ provide little clear aid 
in explaining the transition from one cognitive stage to another. However, the relationship of 
experience, and of social experience in particular, to cognitive development is hardly considered in 
these chapters. It is only in the chapters by Schaffer and Nelson, and the commentary by Mandler in 
the final section, that the crucial importance of studying development in the social context in which the 
infant grows up becomes clearly articulated. Schaffer's chapter provides an excellent summary of the 
important issues in recent work on social interaction between infant and caregiver, and the role that 
early interaction plays in the growth of communicative understanding. Precisely how this preverbal 
‘dialogue’ is related to the acquisition of language is still very unclear, and it is obviously important 
to recognize that in using the word ‘dialogue’ to describe early interaction we are making an analogy 
with later use of verbal dialogue, not explaining the origins of the latter. Nelson argues that the 
cognitive and social prerequisites for language are well developed by the first birthday. The fact that 
two-word speech does not appear until a year later reflects, she proposes, the importance and the 
difficulty of two developments which take place during the second year: the integration of 'social and 
object realms', and of two separate language functions, ideational and interpersonal. It is an 
interesting and provocative chapter and, in demonstrating the importance of the social context of 
development, the final chapters do go some way towards redressing the balance of the book, 
otherwise weighted towards the development of the child's understanding of the physical world. It is 
only in the final section of the book that the historical tradition of psychology as an academic 
subject, and the subject-matter which a valid psychology must in principle be capable of explaining, 
come at last into a clear confrontation. An uneven, but stimulating and lively book, well timed in its 


Lateralization of functions in the vertebrate brain: A review 


S. F. Walker 





That the human left and right cerebral hemispheres perform different functions is widely accepted, 
but there is little evidence of whether or not similar functional asymmetries exist in non-human 
vertebrates. In this paper, neuro-anatomical similarities between human and other vertebrate brains 
are considered, and data concerning physical asymmetries reviewed. The defining features of human 
lateralization are taken to be right-handedness, as a skewed but continuous distribution of 
preferences, and a greater involvement of the left hemisphere in species-specific vocalization, with 
right-hemisphere superiority in spatial perception and emotionality less well-marked characteristics. 
Rodents, cats, at least one species of marsupial, and macaque monkeys have consistent hand 
preferences for food reaching. These may result from constitutional factors, but in every species 
studied the distribution of preferences is unskewed. Canaries appear to have left-hemisphere 
dominance of vocal production, and there is limited support for the conjecture that macaque 
monkeys have left-hemisphere dominance for reception of species-specific cries, and/or for short-term 
auditory memory. Left and right unilateral hemispheric damage may have appreciably different 
effects on emotionality in rats, sound localization in cats, and tactile discrimination in monkeys, 
although the available evidence is equivocal. It seems possible that asymmetries of cerebral function 
are widespread in vertebrates. In particular, left-hemisphere dominance of species-specific 
communication might be common in birds and primates: left-hemisphere dominance of human 
speech may be an example of a general vertebrate tendency towards unilateral control of vocalization. 





The assignment of different functions to the right and left hemisphere of the human brain is 
a crucial element of current neuropsychology (Ornstein, 1972; Popper & Eccles, 1977; 
Gazzaniga, 1979). The left hemisphere contains mechanisms responsible for speech, and is 
said to operate in a manner suitable for mental arithmetic and logical thought. These are 
important characteristics, and the left hemisphere was formerly held to be ‘dominant’ over 
the ‘minor’ right hemisphere. But more emphasis is now given to the minor hemisphere's 
own specializations — perception and expression of emotion; knowledge of spatial relations, 
especially in connection with visual input; and generally, operations that take place in a 
wholistic, global, or Gestalt fashion. 

The purpose of this review is not to examine in detail the nature of these asymmetries in 
human brain function, but to consider the extent to which differences between the left and 
right hemisphere mark off man from all other vertebrates. There are various ways in which 
lateralization of function could be related to uniquely human capacities of the brain. For 
instance, lateralization would be secondary to language, if it could be shown that 
possession of language induces cerebral asymmetry, rather than vice versa. Any other 
asymmetrically represented process could conceivably be the driving force behind 
lateralization, but linguistic competence is clearly a strong candidate. Possible alternatives 
are right-handedness (Hardyck & Petrinovitch, 1977) and the existence of a conscious self 
which communicates directly with only the left hemisphere (Popper & Eccles, 1977). In 
each of these cases we might suppose that human brains become lateralized only as a 
consequence of the development of language, handedness or a perceiving self, and would 
therefore be surprised to find evidence of lateralization in non-human species which lack 
these characteristics. 

A more direct hypothesis of the cause of lateralization has been put forward by Levy 
(1977). She suggests that a division of labour between the hemispheres ‘almost doubles 
the overall cognitive capacity of the human brain’. In other words, space in non-human 
brains is used up by symmetrical duplication of behaviour-controlling processes; 
abandoning this precaution produces a large quantitative bonus. The 'dividing and 
doubling' idea emphasizes the uses of the minor hemisphere. It bears on why there should 
be qualitative differences between the hemispheres, and assigns lateralization a special 
evolutionary role — if it is inferred that the distinctiveness of the human species is achieved 
only by making use of extra capacities attainable with hemisphere differences. 

Semmes (1968) and others have stressed other advantages of having two different kinds of 
processing strategy available — the analytic and serial in one hemisphere, and the diffuse 
and global in the other. Hypotheses such as these may be tested within the human species 
by comparing the abilities of groups of individuals with differing degrees of functional 
lateralization, and there is some support for the view that lack of lateralization, insofar as it 
occurs in moderately left-handed people, is associated with reduced cognitive capacities 
(Levy, 1969; Miller, 1971). However, these differences are not as large as one might expect 
if lateralization doubles, or qualitatively improves, human intelligence; indeed there is 
considerable doubt as to whether reliable differences in ability correlated with degree of 
lateralization exist at all (Hardyck, 1977). 

As overwhelming advantages of cerebral asymmetry are not easily established by the 
study of individual differences in man, it is all the more important to investigate other 
sources of evidence as to its origin. Implicit in most theories is the view that lateralization 
is inextricably linked with especially human intellectual characteristics — if this is the case, 
the absence of lateralization of brain function in animals needs to be convincingly 
demonstrated. On the other hand, if anatomical and functional precursors to human 
cerebral asymmetry can be found in species other than Homo sapiens, we might be 
provided with new clues concerning its existence in ourselves. I will discuss, first, some 
relevant aspects of comparative neuro-anatomy; second, what may be taken as firm 
features of human cerebral dominances; and third, evidence concerning the occurrence of 
these features in other animals. 


1 Neuro-anatomical considerations 


In a sense, a necessary condition for cerebral asymmetry is symmetry — in the form of 
paired, roughly symmetrical hemispheres. This condition is satisfied in all vertebrates, 
although both hemispheres are small in lower vertebrates (fish, amphibians and reptiles) 
and make up a much bigger proportion of the brain in higher vertebrates (birds and 
mammals). The vertebrate brain is a paired organ, from the spinal cord up (Dimond, 1972; 
Pearson & Pearson, 1976). This duplication of brain halves is associated with what may be 
termed symmetrical lateralization of sensory and motor functions. For cutaneous sense, 
and motor control, each half of the brain, or spinal cord, usually has dominant 
responsibility for one half of the body. This system is complicated by the fact that each half 
of the nervous system is often connected to the opposite sided sensory and motor devices — 
‘crossed-lateral’ control. The reasons for the cross-over are obscure. Sarnat & Netsky 
(1974) suppose that primordial defensive reflexes in, for instance, the vertebrate precursor, 
Amphioxus, required that tactile stimulation from one side of the body elicit muscle 
contraction on the opposite side and that cross-overs were thus built into the vertebrate 
nervous system from the start. This does not go far in explaining the great variety of 
decussation in higher vertebrate brains — these may have embryological and functional 
advantages, but their nature remains a mystery. 

The important fact is that it is common for one half of the paired structures to act 
independently of the other in terms of sides of the body. There are consistent differences, 
across vertebrate classes, between the type of symmetrical lateralization employed for the 
various sensory modalities. Olfaction is predominantly ipsilateral, with olfactory nerves and 
subsequent tracts going through cerebral hemispheres on the same side as the nostril from 
which they begin. For vision the basic plan is for the left visual field to project to the right 
side of the brain (in both mid-brain and forebrain routes) and vice versa. In non-mammals, 
the rule (with some exceptions) is for optic nerves from each side-facing eye to go to the 
opposite side of the brain, while in mammals, partial decussation at the optic chiasma 
ensures that the left visual field from both front-facing eyes reaches the right side of the 
brain (Ebbesson, 1979). With hearing there is a greater degree of duplication of hemisphere 
function in mammals, since each ear projects to both sides of the brain, due to a variety of 
decussations and commissures starting in the medulla (Pearson & Pearson, 1976). A similar 
arrangement of the auditory pathways occurs in birds (Boord, 1968) and reptiles (Foster & 
Hall, 1978). The contralateral auditory projection is, however, the most direct, and it has 
been proposed that in the event of conflict between information from the two ears (as in 
the dichotic listening paradigm with human subjects) the ipsilateral channel may be actively 
suppressed in mammals (Aitkin & Webster, 1972). Despite this degree of contralateral 
advantage, it must be stressed that, in terms of hemispheric specialization, audition offers a 
rather different set of possibilities than vision. A single hemisphere receives information 
from both ears, and may therefore operate on all auditory input, and make comparisons 
between right and left auditory fields, without the intervention of the other hemisphere. For 
instance, a single hemisphere could in theory localize a sound source at any point in the 
horizontal plane, or decode speech entering either ear, even in the absence of cerebral 
commissures. By comparison, for visual stimuli, a single hemisphere acting alone is 
restricted to its own half of the visual field. A mammalian hemisphere — and to a limited 
extent, that of some other vertebrates (Ebbesson, 1970) — has the advantage of being able 
to compare inputs from the two eyes, arising from one point in the visual field, thus 
deriving stereoscopic depth. But the nature of the mammalian chiasma tends to retain the 
left-visual-field-to-the-right-hemisphere system which occurs in lower vertebrates with 
side-facing eyes. 


1.1 Integrating symmetrically lateralized information 


This brings up a general requirement which follows from the symmetrical separating out 
of left and right sensory and motor information. For vision, it would clearly be a 
disadvantage, even for the simplest vertebrate, if an object detected in the left field had to 
be detected anew when the object, or the animal, moved round so that the stimulus 
appeared on the right (see Walls, 1942). To prevent this sort of double vision, it would be 
necessary that information as to the identity of the object is passed from one side of the 
brain to the other. It is usually taken for granted that facilities for such a transfer of 
information exist in man and other mammals, in the shape of the corpus callosum 
(Pearson & Pearson, 1976). However, it has also been established that some inter-ocular 
transfer of learned visual discriminations takes place in goldfish (Yeo & Savage, 1975; Ingle 
& Campbell, 1977), and pigeons (Cuenod, 1974; Zeier, 1975) by means other than 
transmission via corpus callosum. The method involves training with one eye covered, and 
subsequent testing for the performance with only the other eye available, with or without 
surgical lesions of putative transfer routes. In goldfish, sectioning the tectal (midbrain) 
commissures does not necessarily stop inter-ocular transfer of shape discrimination (Yeo & 
Savage, 1975) but alternative forebrain commissures appear to be necessary (Ingle & 
Campbell, 1977). In the pigeon, Cuenod (1974) showed that transfer of colour and shape 
took place via the supra-optic decussation, from the thalamus to contralateral hemisphere, 
and not via mid-brain commissures. Electrophysiological recordings establish that each 
hemisphere of the pigeon contains neurons responsive to ipsi- and bilateral, as well as 
contralateral inputs: it appears that in birds a single hemisphere can store and compare 
information from both eyes, and from both halves of the visual field (Zeier, 1975; Fox et 
al., 1977). 

The point is that, together with symmetrical lateralization of sensory input, vertebrates 
typically arrange for some kind of bilateral availability of relevant information. Corballis & 
Beale (1970, 1976) assumed that the transfer is so complete that animals have difficulty in 
discriminating mirror images about the mid-line. With some kinds of displays such 
difficulties can occur in man as well as other vertebrates, but information as to whether a 
stimulus is on the left, or on the right, must be retained in the case of turning responses 
and the body-image. Using the tactile sense, an animal may scratch its right side with 
either a right or a left limb, and does not normally attempt to scratch bilaterally in 
response to a unilateral itch. 

The presence of a large corpus callosum indicates that the human brain is particularly 
well equipped for inter-hemispheric communicativeness. However, in higher mammals, the 
corpus callosum connects some regions of the cortex very well but others very badly, if at 
all (Ebner, 1969). It should first be emphasized that the small hemispheres of lower 
vertebrates may be just as well-connected as the larger cerebral structures of mammals, 
albeit via different pathways. All vertebrate forebrains possess the anterior commissure 
(which contains several distinct inter-hemispheric tracts) and a ‘posterior pallial 
commissure’ (the psalterium between hippocampi in mammals) together with connections 
at diencephalic levels. Only placental mammals utilize a true corpus callosum, but 
marsupial mammals have very large anterior and hippocampal commissures, and in some 
cases ‘aberrant bundles’ parallel to these (Kappers et al., 1936). The addition of the corpus 
callosum may be seen as retaining the usual ratio of inter-hemispheric exchange for larger 
hemispheres rather than increasing the extent of hemispheric cross-talk. The degree of 
connectedness of marsupial hemispheres can be anatomically assessed by plotting the 
distribution of degenerating fibres following transection of their forebrain commissures. 

By this technique, Ebner (1967) demonstrated that there is an even spread of 
commissural connections over the entire surface of the hemispheres of the Virginia 
opossum, a pattern identical to that found in the hedgehog, a primitive placental which has 
the advantage of a fully fledged corpus callosum. But when the same survey was made of 
the cerebrum of cats, raccoons and rhesus monkeys — ‘higher mammals’ with a more 
substantial corpus callosum than the hedgehog — commissural connections were found to be 
unevenly distributed, with some areas of the hemispheres, including parts of the somatic 
and visual cortex, completely free of such fibres (Ebner & Myers, 1962a; Ebner & Myers, 
1965). 

There is some support, then, for the proposition that the development of the corpus 
callosum in mammals goes hand in hand with an increase in areas of cerebral neocortex 
which lack inter-hemispheric coupling. It should be noted that, although marsupials and 
primitive placentals have relatively homogeneous distribution of commissural fibres, some 
reptiles possess an irregular pattern more like that of the cat and monkey (Ebner, 1969). It 
should also be said that Dimond (1972) proposes a rule opposite to that just suggested: he 
asserts that higher mammals have more widespread commissural interconnections than 
lower mammals. However, he was led to this assertion by the limited inter-hemispheric 
projections of the raccoon. As a specialized carnivore (with primate-like performance in 
laboratory tests) the raccoon is an advanced mammal, and as such was used by Ebner 
(1969, p. 247) to support the conclusion that ‘commissure-free cortical regions of more 
specialized eutherian (placental) mammals appear at a later stage of the development of 
neocortex’. The raccoon’s hands are well adapted for catching fish and crustaceans by 
touch (Whitney & Underwood, 1952), and its powers of visual abstraction as measured by 
success at learning sets (Johnson & Michels, 1958) and patterned string problems (Michels 
et al., 1961) appear to be considerable. As with cats and monkeys, tactile and visual 
prowess is accompanied by the absence of commissural connections to the primary visual 
and touch areas of the two hemispheres. In the raccoon, the region of somatosensory 
cortex devoted to the hand, which lacks commissural terminals, is extremely large (Welker 
& Seidenstein, 1959). 

The generality of some degree of point-to-point cross-forebrain associations in lower 
vertebrates can be demonstrated by the same phenomenon that necessitates surgical 
callosal sections in human epileptic patients — artificial induction of an epileptic focus in the 
forebrain of teleost fish, amphibians or reptiles, results in the propagation of a secondary 
focus at the symmetrically opposite position in the contralateral hemisphere (Servit & 
Strejckova, 1970). 

One may tentatively conclude, then, that vertebrate brains are organised according to a 
symmetrically lateralized plan which matches one side of sensory input or motor output 
with one side of the brain. There are differences in whether the same or opposite side of the 
brain dominates lateral input and output, although crossed lateral control is the norm. For 
all modalities there are some arrangements for bilateral integration, and symmetrical 
interchange of information. In general, one side of the brain mirrors the other, functionally 
and anatomically. This is the background against which any functional or structural 
asymmetry stands out. 


1.2 Physical asymmetries of the human brain 


It is safe to say that interest in functional asymmetries has preceded the emphasis on human 
neuro-anatomical imbalance. But since Geschwind & Levitsky (1968) suggested that 
disparities between the surface area of the left and right planum temporale on the dorsal 
horizontal extension of the temporal lobe were related to left-sided localization of speech 
perception, considerably more data have become available (e.g., Geschwind, 1974; Rubens, 
1977; Galaburda et al., 1978). The planum temporale is a triangle of secondary auditory 
cortex, and is in the vicinity of the neurologically identified Wernicke's area which, when 
damaged, impairs speech comprehension (but see Bogen & Bogen, 1976, ‘Wernicke’s 
region — Where is it?’). Geschwind & Levitsky found that out of 100 brains, 65 had a 
significantly larger planum temporale in the left hemisphere, compared to the right. Of the 
remaining 35 brains, 11 had a larger area on the right, and 24 were classified as 
symmetrical. Others have followed 19th-century investigations in measuring the length of 
the Sylvian fissure, in which the planum temporale lies (Yeni-Komshian & Benson, 1976; 
Rubens, 1977). The Sylvian fissure tends to be longer in the left hemispheres than in the 
right, and in 25 of 36 brains (69 per cent) examined by Rubens et al. (1976) it continued 
further horizontally on the left before bending upward. The human brain thus appears to 
be typically, if not universally, asymmetrical in the region of the temporal/parietal 
boundary. Moreover, the primacy of the left planum temporale is already evident in foetal 
and neonatal brains (Wada et al., 1975; Witelson & Pallie, 1973). 

Is this because the entire left hemisphere is larger than the right? If anything, the minor 
right hemisphere weighs more than the left (Lemay, 1976), and there is evidence that the 
right hemisphere leads the left in prenatal growth (Chi et al., 1977). Campain & Minckler 
(1976) found that primary auditory cortex, immediately in front of the planum temporale, 
tends to be more extensive on the right, although there were enormous individual 
variations in their sample. A more general demonstration of the nature of hemispheric size 
differences has been obtained by a radiological brain scan technique (Lemay, 1976). This 
does not at present allow for measurement of bounded areas such as the planum temporale, 
but provides a coarse-grained picture of living brains in situ. Since fixed brains may suffer 
up to 40 per cent shrinkage, and mechanical distortion, this is valuable additional evidence. 
The results may be summarized as follows: the front of the human brain is wider on the 
right; the back of the human brain is wider on the left. This applied to individuals 
categorized as right-handed. For the left-handers, the frontal lobe still tended to be larger 
on the right, but was more often equal, and the right occipital lobe was also often the 
larger of the two. 

The contention that left-handers have less functional asymmetry than right-handers, 
rather than the opposite form of asymmetry, thus appears to find support in Lemay's 
EMI scans. 

However, it should not go unremarked that the correlation between left-right inequalities 
of size and left-right assignment of function is somewhat less than perfect. Of the four 
major asymmetries of function, three (language reception, language production and fine 
motor control) are localized in the left hemisphere, whilst the fourth (visual-spatial 
processing) is thought to be a right-hemisphere specialization. Yet, of these four instances, 
only in one, language reception, does the physical inequality between the hemispheres go in 
the obvious direction, insofar as the planum temporale is larger on the 'correct' side in 
60–70 per cent of the brains studied. Although language production relies on the left 
hemisphere ‘Broca's area’, this area, along with the rest of the frontal lobe, is larger on the 
right (Wada et al., 1975). If other frontal motor areas are larger in the right hemisphere, as 
Lemay's data would suggest, then this is the wrong direction for motor dexterity. Finally, if 
visual-spatial functions are better represented in the right hemisphere, one might have 
expected an enlargement of the right occipital lobe, but it is the left occipital lobe which is 
physically dominant (Lemay, 1976). 

Thus, although any physical left-right asymmetries may be taken as signs of functional 
lateralization, the exact implications of the various hemispheric size inequalities in man are 
by no means clear-cut (Whitaker & Ojemann, 1977). It may be observed that if the 
‘division of labour’ between the hemispheres were an entirely equitable one, pronounced 
distinctions between the functions of the two hemispheres might occur in the absence of 
any difference in gross physical dimensions. 


1.3 Physical asymmetries in non-human brains 


Systematic searches for physical hemispheric asymmetries in animals which might 
correspond to those observed in man have, reasonably enough, been confined to primates. 
Generally speaking, it is assumed that non-human brains are bilaterally symmetrical 
(Dimond, 1972). However, there are indications that the anthropoid apes (chimpanzee, 
orang-utan, and gorilla) exhibit anatomical hemispheric asymmetries that are similar to, 
though smaller than, those observed in man. 

Cunningham (1892), having observed that the upward turn of the Sylvian fissure was 
more acute on the right in human brains, found a similar disparity in the brains of 
chimpanzee, orang-utan, and baboon (a large Old World monkey), but not in those of a 
macaque (a typical Old World monkey, smaller than the baboon). Yeni-Komshian & 
Benson (1976) found a corresponding asymmetry in the length of the Sylvian fissure in 
chimpanzees. For 25 chimpanzee brains the mean length of the Sylvian fissure was greater 
on the left than on the right: a minute (45.7 mm versus 43.7 mm) but statistically 
significant difference. For the rhesus monkey (a species of macaque) there was a similar 
difference in the mean length of the fissure, but it did not reach statistical significance. A 
difference between monkeys and apes is suggested by other studies. Wada et al. (1975) 
found no temporal lobe asymmetries in rhesus monkeys or baboons. Lemay & Geschwind 
(1975) report that the point of termination of the Sylvian fissure tends to be higher on the 
right than on the left in the chimpanzee, gorilla and orang-utan, but that this difference is 
relatively rare in the New World and Old World monkeys. Within the great apes, the 
orang brains were most, and the gorilla’s least, asymmetrical. A considerable proportion of 
the apes did not show the asymmetry (out of 27 ape brains, only 16 had left/right Sylvian 
fissure height differences of more than 3 mm). 

Given the vagaries of cortical growth (Richman et al., 1975) it would be a matter for 
comment if hemispheric asymmetries in individual brains within any species were not 
observable, with sufficiently precise measurement. What is of interest is whether systematic 
forms of asymmetry appear in species other than man. It appears from the investigations 
quoted above that the apes may exhibit left/right Sylvian fissure disparities of the same 
kind as that found in man, but to a lesser degree. In contrast, at present the data for 
monkeys suggest that their Sylvian fissure left/right asymmetries are not comparable to the 
human case. 

However, the Sylvian fissure is not necessarily the most appropriate test of brain 
asymmetries. Cain & Wada (1979) have suggested that it would be useful not to 
concentrate exclusively on the temporal lobe, after finding frontal lobe asymmetries in 
baboons of the same size as those found in man (with larger measurements on the right). 
Unfortunately they were not able to compare the occipital lobes, which were damaged, but 
their data suggest that a more systematic assessment of brain asymmetries than those 
previously conducted may demonstrate many parallels between the physical asymmetries of 
the human brain and those of other primates. 

Other mammalian species have not been as thoroughly investigated as primates, and in 
the absence of data it is usually assumed that marked left/right hemispheric asymmetries 
are absent. Individual differences in hemisphere size and fissural pattern are, of course, quite 
common. Webster (1977), for instance, examined the brains of 33 cats, and found, firstly, 
that 18 could be categorized as symmetrical in fissural pattern while the other half (15) 
were asymmetrical, and secondly, that two-thirds (10) of the asymmetrical brains differed in 
visual, rather than other, areas (although no left/right imbalance was reported). Individual 
brains of various mammalian species may differ similarly in degree of asymmetry, and thus 
in degree of individual left/right imbalance, with as yet undetermined implications for 
species-characteristic hemispheric inequalities (Royal College of Surgeons, 1902). 

A possible exception to the general symmetry of the skeletons of mammalian species 
(apart from higher primates) occurs in the suborder of Odontocete whales. These differ 
from Mysticete (whalebone) whales, and indeed from most other mammals including man, 
in having markedly asymmetrical skulls (Slijper, 1962; Ness, 1967; Tomilin, 1967). Not all 
species in the suborder show the same degree of cranial asymmetry, which is most readily 
observable in the width of the jaw-bones. Just how readily is illustrated by the case of 
Bottlenose whales, whose right premaxillary bone may be twice as wide as the left. Clearly 
there may be special reasons for these peripheral asymmetries, unconnected with functional 
asymmetries in the brain; for example, asymmetries in the nasal passages may be connected 
with the use of these passages for sound production in some species. However, marked 
differences in the left and right nasal, premaxillary and maxillary bones may be 
accompanied by such cranial asymmetries as, for instance, in the temporal-parietal fossa 
and the occipital crest. In all cases it is the right side of the skull which is better developed. 
There is some tendency for the degree of asymmetry to increase with skull size. The 
Narwhal has a particularly asymmetrical skull, as well as having a single (left) tusk (in the 
male), but degree of asymmetry does not seem related to the presence of the tusk (Ness, 
1967). 

The most familiar toothed whale, the Bottlenose Dolphin (Tursiops truncatus), has a 
relatively evenly developed skull, although Tomilin (1967) gives mean figures for the 
intermaxillary bone in this species which suggest a sex difference in asymmetry. It is 
hazardous to infer brain characteristics from peripheral skull measurements, but it seems 
more probable than not that systematic physical hemispheric asymmetries of some kind 
occur in most species of Odontocete whales. Due to the ‘hyperfissurization’ of whale 
brains, examination of fissural patterns would be arduous. Published photographs and line 
drawings of cetacean brains make it quite clear, however, that the fissural patterns of one 
hemisphere are not always mirror images of those on the other (Slijper, 1962; Morgane & 
Jacobs, 1972). 

In view of the inferences made from fossil human skulls, which are also larger on the 
right (Abler, 1976; Lemay, 1976), and in the absence of more precise comparisons of the 
left and right hemispheres of the toothed whales and dolphins, it can only be said that the 
existence of species-characteristic cerebral asymmetries in this mammalian taxon remains a 
possibility. It is unlikely that many people have examined the skulls or brains of any 
non-primate species to see if one side differs from the other by a factor of 1 or 2 per cent. 
Given the almost universal asymmetry of the other internal organs, slight asymmetries in 
the skull or brain of any vertebrate would not necessarily be a matter for comment. Some 
vertebrate brain asymmetries may therefore remain to be discovered. In crocodilians, for 
instance, very striking individual asymmetries of the skull are not uncommon, but few 
systematic surveys have been made of this phenomenon (Iordansky, 1973). A rare 
comparison of right and left hemispheres in the rat (Diamond et al., 1975) suggested that 
the cortex of the right hemisphere may be between 1 per cent and 10 per cent thicker than 
the left, depending on which of seven cross-sections is measured, from the sixth day of life 
onwards. Although the right-hemisphere supremacy was maintained throughout the life 
span of the animals, the maximum asymmetry, at each of the seven sections, was observed 
at some point during the first 6 weeks. 

All the physical asymmetries discussed so far, human and animal, have concerned overall 
or surface features of the cerebral hemispheres. It is more difficult to examine internal, 
subcortical brain characteristics, but these are generally assumed to be bilaterally 
symmetrical. There is a well-known exception, however, in the diencephalon of lower 
vertebrates. The habenular nuclei, situated on either side of the third ventricle, are 
noticeably unequal in size in cyclostomes (lampreys and hagfish), sharks, and some species 
of teleost fish and amphibians. The left habenula is better developed in sharks, and in 
those species of frog which have an asymmetry, but the right-hand nuclei are the larger in 
cyclostomes, and the favoured side varies from species to species in teleost fish (Kappers et 
al., 1936; Braitenberg & Kemali, 1970; Morgan et al., 1973). When it occurs, this 
anatomical asymmetry is quite unequivocal, but its functional implications are unknown. 
The habenulae are traditionally classed as olfactory correlation centres (Kappers et al., 
1936) as they have fibre tract connections with the olfactory bulbs, the amygdaloid, septal 
and hippocampal regions, with the thalamus, with midbrain centres, and with each other 
(via habenular commissures). Habenular nuclei are retained in mammals and birds, and 
their connections remain ‘surprisingly constant throughout the vertebrate series’ (Kappers 
et al., 1936, p. 1264). They can obviously be regarded as an integral part of the limbic 
system (Pearson & Pearson, 1976). Since the details of habenular function in any species 
are not known, it is difficult to draw conclusions from the observed physical asymmetries, 
but it would hardly be surprising if limbic system function were in some way lateralized in 
the species in question. To the extent that human emotionality is held to be asymmetrically 
controlled, functional lateralization in the human limbic system might also be expected. 

Functional lateralization in the human diencephalon is occasionally referred to in the 
context of verbal performance: because of its extensive two-way connections with the 
cortex, the thalamus ought to reflect, if not determine, lateralization of sensory functions, 
and electrical stimulation of the thalamus in human subjects suggests a degree of left 
dominance of verbal processes (especially for the pulvinar and ventro-lateral nuclei — 
Ojemann, 1976; Riklan & Cooper, 1977). It was noticed by Haight & Neylon (1978) that 
occasional individuals (about 10 per cent) in a sample of brush-tailed possums had serious 
asymmetries in the thalamus, but not elsewhere in the brain. The diencephalon appears 
susceptible to physical asymmetries which may be embryological in origin. 


1.4 Sensory and motor pathways 


It can safely be assumed that the innervation of sensory and motor organs has identical 
physical characteristics on the left and right side of the body, as a general vertebrate rule, 
and that the same applies to major sensory and motor pathways within the brain. 
However, there may be interesting exceptions. Cobb (1964) reported that for a species of 
owl and for the South American oil-bird, midbrain auditory projections were larger on the 
left than on the right. Owls make use of hearing for nocturnal location of prey, and several 
species have pronounced asymmetries of the external auditory meatus, consistent within the 
species, which may assist this (Erulkar, 1972). The oil-bird also flies under low levels of 
illumination, and employs a bat-like echo-location system. 

A neurochemical, rather than an anatomical, asymmetry has been reported for the 
nigrostriatal motor system of rats (Zimmerberg et al., 1974; Glick et al., 1977). Normal 
rats have a 10-15 per cent difference between the dopamine content of the ‘high’ and ‘low’ 
sides of the nigrostriatal pathway, but the ‘high’ side is as often on the left as on the right. 
A large dose of amphetamine increases the disparity between ‘high’ and ‘low’ sides to 
25-30 per cent, and the difference is accompanied by behavioural side preferences (see later 
section). This is another example of individual variations in left/right balance within a 
species, with the bias towards left or right varying randomly. It is conceivable that random 
irregularities make individual variations in sensory and motor efficiency on the left and 
right a common occurrence, without contributing to the behavioural characteristics of the 
species. 

For rare examples of an entirely different state of affairs, where a species has a 
characteristic asymmetry between left and right motor systems, one has to fall back on the 
flat-fish, where left corresponds to bottom, and right to top, or vice versa according to 
species (Neville, 1976). Even rarer is a pronounced sensori-motor asymmetry in an upright 
species: the New Zealand Wrybill is a bird with its entire bill curved sharply to the right, 
and as a consequence it must adopt a rigidly unilateral approach to the business of turning 
over pebbles (Thomson, 1964). 


2. Functional asymmetry in man 
2.1 Handedness 


There is little doubt that the study of functional asymmetry in animals is predicated on the 
existence of a reliable human model. Agreement on the nature and significance of 
functional asymmetries in man is not universal; but it is necessary to isolate features of the 
phenomena that can serve as the basis for cross-species comparisons. The most visible 
asymmetrical human attribute is, of course, right-handedness. 

Human handedness is not quite so fixed and unitary a trend as the quotation of a 90 per 
cent incidence of right preference implies (Annett, 1970, 1972). Our apprehension of 
handedness is influenced by its most obvious manifestation — handwriting. This skill is 
severely lateralized (Annett, 1970) in that writing with the left hand is rare within 
population samples, and very few individuals can be said to have equal proficiency with the 
non-preferred hand. However, it is rather unlikely that the human species is genetically 
specialized for unilateral handwriting as such, since it is not a skill which has had much 
time to operate as a selection pressure. The most lateralized skill of all those assessed by 
Annett (1970) was hammering, which has a long enough history to be a more plausible 
candidate for special evolutionary selection (e.g., Oakley, 1972). 

Annett (1972) suggests that there is a large group of individuals of ‘mixed handedness’ 
who perform some tasks (e.g., writing) with one hand, others (e.g., throwing, or using 
scissors) with the opposite hand, and some skills (e.g., unscrewing the top of a jar) with 
either hand. Human handedness, Annett concludes, is a continuous dimension, though 
heavily skewed towards right-handedness. 


Preference, capacity and transferability 


It would be helpful to know whether human handedness consists of a difference in 
preferred uses to which the two hands are put, or more fundamental differences in the 
capacities of the left and right hands (or rather, of the contralateral hemispheres that 
control them). In the context of theories of cerebral dominance, it could be assumed that 
the right hand is preferred for detailed work, because only the left hemisphere is capable of 
supporting highly organized sequences of limb and finger movements. Forcing the right 
hemisphere to undertake control of activities which are properly the province of the left 
should therefore result in clumsier performance. This is a stronger proposition than simple 
preference — individual animals, or even particular animal species might exhibit arbitrary 
preferences without an underlying disparity of hemispheric capacities. 

The only thing which can be said with any confidence about human handedness and 
hemispheric capacities is that the capabilities of the minor hemisphere might not be as 
limited as the bare facts of hand preference might suggest. The evidence for at least 
adequate capacities of the minor hemisphere for control of manual skill can be drawn from 
the existence of individuals with mixed handedness (Annett, 1970) and from forced use of 
the non-preferred hand in some skills, notably the playing of musical instruments (Oldfield, 
1969). Individuals who write with one hand and perform high level skills with the other (for 
example, in bowling a cricket ball, or using a tennis racket) demonstrate that motor 
incompetence of one of the hemispheres need not be regarded as a defining feature of 
human handedness. There is no evidence that ‘left-handers’ have difficulties in playing 
‘right-handed’ musical instruments (Oldfield, 1969). Most musical instruments require fine 
motor control of both hands, which in itself indicates minor hemisphere potential. Some, 
such as the piano, seem to require faster or more delicate performance by the right hand, 
but left-handers are not known to be disadvantaged by this arrangement. A few 
instruments, in particular the violin and similar stringed instruments, require very fine 
control of left-hand finger movements, yet are not the exclusive preserve of the left-handed. 
It is arguable that movements of the left hand of a virtuoso violinist represent the high 
point of human dexterity. Oldfield (1969) suggests that this anomaly results from a 
left-hand preference for holding, exhibited initially when instruments were played by 
right-hand manipulation of open strings, and retained when changes in technique 
demanded detailed movements by the holding hand. Similarly, when the French horn 
acquired keys, the left hand, originally limited to holding the instrument while the right 
hand altered pitch by occluding the open horn, adopted the 'dextrous' job of key fingering. 
Oldfield wonders whether the right arm may be more adept at ballistic movements, with 
left-side capabilities limited to modification of finger holds. But accurate ballistic use of the 
left hand, by right-handers, is evident in the skilled playing of stringed instruments (and in 
certain styles of boxing). 

It is arguable, therefore, that human handedness is largely a matter of preference, rather 
than a consequence of biologically programmed inadequacies of the minor hemisphere. As 
a species, humans are right-handed. However, there is a small minority of left-handers, 
both familial (fairly ambidextrous) and non-familial (strongly left-handed — Hardyck & 
Petrinovich, 1977). Both kinds of left-handers are recognizably human, even if suffering 
from minor intellectual impairments (Levy, 1969; Hardyck et al., 1976). For the purpose of 
comparisons with other species it is all the more appropriate to consider human 
handedness as a continuous distribution of preferences, skewed towards the right (Annett, 
1970, 1972, 1975). 


2.2 Language 


It is widely held that human language is controlled by the left side of the brain. The 
strongest evidence for this is neurological — individuals who suffer left-sided damage are 
more likely to also show language deficits than patients with right-hemisphere damage 
(e.g., Milner, 1974; Coughlan & Warrington, 1978). Post-mortem confirmation of side of 
lesions shows that very few patients with language deficits resulting from brain damage 
have lesions in the right hemisphere (Zangwill, 1960). Other diagnostic techniques support 
this finding, and even left-handers are found to suffer more impairment of language 
performance after left-sided hemispheric insult (Milner, 1974). 

The neurological literature on various sorts of language dysfunction after brain damage 
is exceedingly complex: individual differences abound (Ojemann & Whitaker, 1978), and 
almost every conceivable permutation and combination of impairments and preservations 
of speech production, speech comprehension, reading and writing may occur (Goldstein, 
1948; Gardner, 1977). Studies of patients who have undergone section of the cerebral 
commissures for the relief of epilepsy have confirmed that the left hemisphere is more 
important for language functions than the right (Sperry, 1968; Gazzaniga, 1975) but also 
support the contention that the right hemisphere normally possesses some limited abilities, 
especially for language comprehension (Zaidel, 1978). 

We take it, then, that left-hemisphere dominance of language functions is a 
species-specific human characteristic; but with qualifications. The right hemisphere is quite 
capable of accomplishing human language at almost, if not completely, normal level, if the 
left hemisphere is damaged at an early age (Basser, 1962; Dennis & Whitaker, 1976), and 
may play some part in language function in all individuals (Searleman, 1977; Zangwill, 
1978; Coltheart, 1979). 


2.3 Perception 
Hearing 

It is difficult to isolate hearing generally from speech reception. When speech is played 
into the left and right ears simultaneously, in dichotic listening studies, normal 
right-handed human subjects frequently show a right-ear advantage (Darwin, 1974; 
Springer, 1977). The difference is not a large one; given the success of the mammalian 
auditory pathways in directing the input from each ear to both hemispheres (Whitfield, 
1967) a large difference would not be expected, and the technique is not without its critics 
as a method for diagnosing lateralization of speech functions (Berlin, 1977; Colbourn, 
1978). Hearing generally is resistant to impairment from unilateral temporal lobe damage 
(Whitfield, 1967; Adams & Victor, 1977); but there are some reports that right-hemisphere 
lesions impair memory for complex non-verbal sounds (Ravizza & Belmore, 1978). 

The evidence from professional music performers or composers who suffer unilateral 
brain damage is not sufficiently consistent to allow an assignment of music perception to 
one or other of the hemispheres (Gardner, 1977) although the momentum of cerebral 
dominance theories would place it on the right (Ornstein, 1972). The right hemisphere may 
have specializations for hearing, but these can show individual, and cultural, variations 
(Tsunoda, 1975). 


Vision 
There is a large amount of data on the performance of normal subjects in perceiving stimuli 
presented to their left or right visual fields (Hardyck, 1977; Springer, 1977) and a 
somewhat smaller amount concerned with the visual abilities of the left and right 
hemispheres of commissurotomy patients (Gazzaniga, 1975). The findings fall into two 
categories: the left hemisphere (right visual field) is better, or quicker at recognizing brief 
alpha-numeric stimuli; and the right hemisphere is better at analysing more complex 
visual-spatial relationships. 

In the case of normal subjects viewing visual stimuli in the left or right visual field, 
differing responses are of course a matter of degree — significant differences are found with a 
fair degree of regularity, though not universally, but the differences are small (Hardyck, 
1977). Given that the stimuli are usually words or numbers, it is difficult to isolate an 
asymmetry in human vision from the left hemisphere language advantage. 

Evidence for a right-hemisphere superiority in visual-spatial perception rests on the 
performance of the individual hemispheres of split-brain patients (Levy, 1969; Nebes, 1971, 
1974) and of patients with unilateral brain damage (Milner, 1974). If a three-dimensional 
shape held in one hand had to be matched with an ‘unfolded’ two-dimensional 
representation of the shape, presented visually, the left hand (right hemisphere) was more 
successful than the right hand for right-handed commissurotomy patients (Levy, 1969). 
Matching an arc of a circle with the circle to which it belonged (from a choice of three) was 
more successful if both arc and circle were presented to the right hemisphere (either by 
touch or sight, or in a combination of the two) than to the left, in four out of five similar 
patients (Nebes, 1971). 


2.4 Emotionality 


A final area in which the human hemispheres have been supposed to be differentially 
involved is the expression, perception, or experience of emotion. There is some evidence to 
support the view that emotions are expressed more intensely on the left side of the face 
(Sackheim et al., 1978). Although it would appear to be counter-productive, if that is the 
case, observers seem to pay more attention to their left visual field (and thus to the less 
active side of the observed person) when making judgements of the emotional content of 
facial expression (Campbell, 1978). More generally, emotional life is assigned to the control 
of the right hemisphere (Gainotti, 1972; Ornstein, 1972; Gazzaniga, 1975). 


2.5 Sex differences 


Theories of human sexual differences are not without interest in the context of a 
comparative survey, insofar as other species, with more pronounced sexual dimorphisms, 
might be expected to reveal more exaggerated sex differences in behaviour. In terms of the 
behavioural categories discussed so far, human sex differences have been proposed. 
Incidence of right-handedness is higher in the human female than in the male (Hardyck & 
Petrinovich, 1977). Females are found to be better than males at speech-related tasks 
(Maccoby & Jacklin, 1975); females may perform less efficiently than males at 
visual-spatial tests (Witelson, 1976) and are generally held to be more emotional than 
males (Maccoby & Jacklin, 1975). These findings do not fall into place in terms of either 
more or less lateralization; but it is usually argued that females are less lateralized than 
males (e.g., Witelson, 1976; McGlone, 1977). If one were to take all proposed sex 
differences at face value, it would seem, not that females are necessarily less lateralized, but 
that they devote the left hemisphere very successfully to speech rather than to logic and 
mathematics, and use the right hemisphere for emotion instead of spatial perception. 


2.6 Conclusions 


The tendency for individuals to perform tasks with only one hand, and for this hand to be 
the right hand, is extremely marked in the human species. The evidence that species-specific 
vocalizations are produced and interpreted by one hemisphere rather than both, and that 
the dominant hemisphere in this sense is usually the left, is very strong. By comparison, the 
evidence for other asymmetries in cerebral function, entirely unrelated to language or 
handedness, is less robust, but it is thought that the right hemisphere may be better than 
the left at complex visual-spatial perception, and may carry a greater responsibility for 
various forms of emotionality. 


3. Functional asymmetry in non-human vertebrates 


In asking whether anything similar to human functional lateralization may be observed in 
other species, two areas — one-sided motor preferences, and unequal use of the cerebral 
hemispheres in vocalization and communication — appear to have prime importance, and 
sections under these two headings follow. Most of the remaining findings to do with human 
functional asymmetry can be discussed in terms of perception, or emotionality, and a final 
section covers animal work under these headings. 


3.1 Lateralization of motor performance 


Of work which may be related to functional lateralization in animals the largest category 
by far concerns putative analogues to human handedness, in terms of forelimb preferences. 
Many species of mammals, particularly rodents and primates, make regular use of the 
forelimbs for moving external objects, in the course of finding food, and carrying it to the 
mouth, or in nest-building and digging. Some carnivores, notably bears and raccoons, are 
well adapted for manipulative use of the forelimbs, and although little work has been done 
in these species, use of the forepaw in the domestic cat has received some experimental 
attention. The data available are therefore not comprehensive, being heavily biased towards 
rats, cats and monkeys. Use of hind limbs for object manipulation does occur in some 
species of primate, and in bats and birds, but is less studied, although Friedman & Davis 
(1938) reported that individuals from several species of parrot showed a preference for the 
use of the left claw to hold food. 


Forepaw preferences in rats and mice 


The standard sort of test for animal forepaw preferences and typical results are described in 
the monograph by Peterson (1934); these tests were used in the work of his contemporaries 
and collaborators (e.g. Yoshioka, 1930; Peterson, 1951; Peterson & McGiboney, 1951; 
Peterson & Devine, 1963). The test requires the animal to reach for food contained in a 
tube or dish arranged in such a way that single-paw reaches are elicited. The relative 
number of right-limb and left-limb reaches in individual animals is easily measured, and the 
effect of experimental variables on such numbers may be assessed. The conclusions 
reported by Peterson (1934) apply to subsequent work on the rat and to the work of Collins 
(1977) with mice. 

(a) Most rats show a strong paw preference in food reaching. Although 75 per cent of 
reaches with one hand is sometimes used as a criterion for handedness, individual animals 
commonly show almost exclusive use of one paw during daily sessions and across sessions 
separated by months. On these grounds rats may be said to satisfy one of the criteria of 
human handedness: most individual rats consistently prefer one limb over the other, in 
some tasks. It is noteworthy that fatigue, or response inhibition, does not prevent these 
strong preferences from emerging. 

(b) Rat populations contain equal numbers of right-pawed and left-pawed animals on the 
food-reaching task: the distribution of handedness shows no sign of being skewed, and in 
this sense rat populations are distinctly different from human populations. 

(c) Rat samples contain a small proportion of individuals that alternate from one paw to 
the other in the food-reaching task. Less than 10 per cent of a sample is usually 
ambidextrous in this way, with the remaining 90 per cent equally divided between the left- 
and right-handers. 

(d) Rats that prefer one paw in the food-reaching task do not necessarily prefer the same 
paw in other tasks (Peterson assessed lever pressing and latch-opening). 

(e) The proportion of animals left- or right-pawed for food reaching cannot be altered by 
selective breeding. 
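
The 75 per cent criterion mentioned in (a) can be made concrete with a short sketch. The function name, the labels and the example counts below are illustrative assumptions and are not taken from Peterson's monograph; only the 75 per cent threshold comes from the text.

```python
def classify_paw_preference(left_reaches, right_reaches, criterion=0.75):
    """Label one animal from its counts of left- and right-paw reaches.

    criterion is the minimum share of reaches made with one paw for the
    animal to count as lateralized (0.75 follows the text; the function
    name and labels are hypothetical).
    """
    total = left_reaches + right_reaches
    if total == 0:
        raise ValueError("no reaches recorded")
    if left_reaches / total >= criterion:
        return "left"
    if right_reaches / total >= criterion:
        return "right"
    return "ambidextrous"


# Made-up session counts (left, right) for three animals; not study data.
sessions = {"rat_A": (2, 48), "rat_B": (40, 10), "rat_C": (27, 23)}
labels = {
    name: classify_paw_preference(left, right)
    for name, (left, right) in sessions.items()
}
print(labels)
```

Raising the criterion towards 0.98, the figure quoted for Collins' mice, would move more animals into the ambidextrous class without changing the direction assigned to the rest.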

The results obtained by Collins (1968, 1969, 1975, 1977) with inbred strains of mice 
reaching down a tube for food do not conflict in any way with the above conclusions for 
rats. The fact that inbred strains (which may be considered to consist of individuals which 
are genetically very similar to each other) contain equal proportions of left- and 
right-handers, implies that a random process determines whether an individual will show a 
left or right preference. Collins (1969) found no change in distribution of preferences after 
three generations of selective mating for left or right preferences, but did not breed for lack 
of preference (ambidexterity). His view, however (Collins, 1977), is that the degree of 
laterality (for food-reaching behaviour) though not the direction of laterality, is subject to 
genetic variability. There is little evidence for this, since different strains vary only very 
slightly in degree of laterality (Collins, 1968); but females of the strain most studied by 
Collins (1977) consistently show a higher degree of lateralization (i.e. have stronger paw 
preferences with more females making at least 98 per cent of reaches with the same limb) 
than males. 

A similar sort of food-reaching handedness to that observed in rats and mice may be 
characteristic of marsupials. Megirian et al. (1977) studied reaching responses in an 
Australian species (the brush-tailed opossum, Trichosurus vulpecula) using a centrally 
positioned narrow tube facing the animal (very like Collins' apparatus for mice). Of the 78 
animals tested, 51 per cent had a ‘consistent and predominant’ preference for using the left 
forepaw, 45 per cent had the same kind of preference for the right paw, with only 4 per 
cent (three subjects) being classed as ambidextrous. 


Constitutional and environmental determinants of food-reaching laterality 


Since the distribution of handedness for food-reaching in these lower mammals appears to 
be random, with roughly equal numbers of left- and right-pawed animals, it is possible to 
entertain an environmentalist null hypothesis; the choice of limb on the first trial is random 
for each individual, but success on the first trial predisposes the choice of the same limb on 
the next occasion, with cumulative learning accounting for the observed individual 
preferences. An alternative hypothesis is that a random embryological or developmental 
process makes equal numbers of individuals constitutionally left- or right-handed (Collins, 
1977; Morgan, 1977). 

Asymmetrical layout of the testing space has a strong immediate influence on observed 
choice of limbs. A physical barrier, which makes it awkward or impossible to use the 
previously preferred limb, induces immediate use of the non-preferred limb in rats and 
brush-tailed opossums whose initial preference has been already assessed (Peterson, 1934; 
Megirian et al., 1977). If the opening of a tube leading away from the animal is situated 
flush with the side wall, most mice use the limb on that side in the initial exposure to the 
problem, as opposed to the 50-50 distribution obtained with a tube opening in the middle 
of the facing wall (Collins, 1975). 

The malleability of preference in response to the local geography suggests a strong role 
for environmental rather than constitutional factors in determining whether an individual 
animal will show left or right preference. However, several other lines of evidence imply 
that the simple environmental hypothesis that preference is due to random behavioural 
factors plus cumulative learning is not a complete explanation. Collins (1975) tested mice 
initially with a tube on one side (the left, say) which produced a J-shaped frequency 
distribution — most mice favoured the limb on the same side as the tube opening but 
approximately 10 per cent of the populations resisted the environmental pressure, and had 
preferences of various strengths for the more awkward limb. The same mice were then, 
after two sessions in the initial apparatus (biased, for instance, to the left), tested with the 
tube opening on the opposite side (in this example, to the right). Now the distribution of 
preferences became almost U-shaped, with the modal category being mice who showed an 
immediate and very strong (over 96 per cent) preference for the paw consistent with the 
current apparatus. But roughly half the mice showed a preference of some degree for the 
limb which was used in the initial testing, which was now environmentally inappropriate. 
Collins (1968, 1977) showed that the mice who retained the initial preference when it was 
inappropriate had also demonstrated that preference more strongly when it matched the 
environment; whereas the mice which quickly switched to the easy response in the second 
apparatus, had shown lesser degrees of environmental influence towards the first direction. 

Although an explanation for this in terms of individual differences in rate of learning 
could probably be concocted, the most direct interpretation, as given by Collins, is that a 
distribution of handedness in mice is present before the first test. Right-handed mice tested 
in a ‘left-handed world’ resist the practical advantages of conforming to the environment 
by comparison with the higher preference scores of the left-handers, and eagerly switch to 
their natural limb when tested in the right-handed apparatus, whereas the left-handers with 
the experience of putting their favoured limb to good use in the first two sessions have a 
tendency to stick to their guns when subsequently tested in the right-handed apparatus. 

The resistance of animals to experimental variables designed to alter their demonstrated 
paw preference is most amply illustrated by the brush-tailed opossums studied by Megirian 
et al. (1977). Large cortical lesions, intended to destroy all sensory and motor areas 
contralateral to the preferred limb, failed to produce preference shifts in 13 out of 15 
animals, who persisted with their previously preferred limb, despite some qualitatively and 
quantitatively observed loss of efficiency. The cortical lesions did not seriously impair the 
ability to use the preferred limb. This may be due to greater subcortical control of 
movement in marsupials as opposed to rats which do tend to switch preference as a result 
of contralateral lesions of motor cortex (Peterson & McGiboney, 1951). However, almost 
total impairment of ability to move the preferred limb — by injections of local 
anaesthetic — did not result in any change of preference. All seven opossums tested for one 
session (of 1-2 h) after this intervention continued to make unsuccessful attempts to use the 
disabled limb instead of switching to the unaffected, but non-preferred, limb. 

Are marsupials, lacking the corpus callosum, quite unable to transfer skills from one side 
of the body to another? This is not the case: they are unwilling, rather than unable, to use 
the non-preferred limb after interference with the preferred limb by cortical damage or 
peripheral anaesthesia. One is led to this conclusion by the performance of another group 
of brush-tailed opossums (Megirian et al., 1977), who were persuaded to switch preference 
by the presence of a physical barrier. The normal reaching practice was to squat on one 
side of the tube opening and insert down the tube the limb thus positioned in front of it. 
When a large block of wood was bolted to the inside of the cage, where the opossums 
normally squatted, 10 out of 12 subjects had, within 5 minutes, simply adopted the habit of 
squatting on the opposite side and using their non-preferred limb in a mirror image version 
of their initial performance. This indicates that the animals normally have the capacity to 
make use of their non-preferred limb if they perceive the necessity for it. But the forced 
practice induced in these 10 subjects did not change the original preference — when the 
barrier was removed they reverted to their initially preferred paw. The strength of the 
initial preference, even if capacity in the non-preferred limb is available, is also indicated by 
the performance of the other two subjects in the group confronted with the physical barrier 
where they normally squatted. Instead of laterally reversing their stance, these animals 
employed other postural strategies which enabled continued use of the preferred paw; for 
example, an ‘up-down’ reversal of hanging upside down above the tube opening (Megirian 
et al., 1977). 

These results make it extremely unlikely that the limb preferences observed in these 
marsupials were learned during the testing period. Rather, the animals had a behavioural 
or constitutionally derived preference revealed by the test. It was noted above that some 
individuals in this species have a strong anatomical asymmetry in the thalamus (Haight & 
Neylon, 1978). 

It is possible that food-reaching preferences in rats are more subject to environmental 
influences than preferences in marsupials, since Peterson (1951) was apparently able to 
reverse initial preferences by forced practice with the non-preferred limb. However, only 
some individual rats showed the change in preferred limb, and the forced practice was very 
extensive (up to 1000 trials), so there may not necessarily be species differences between 
brush-tailed opossums and rats. 

Peterson (1951) assessed the influence of three experimental procedures on subsequent 
expressed limb preference. In one group, animals were allowed to eat from a dish having a 
flange which prevented use of the preferred limb. A second group fed from a symmetrical 
dish but had the preferred limb strapped to the body with adhesive tape. A third group 
lived through the testing period with their preferred forelimb strapped to the body, but 
were not exposed to the reaching apparatus (though they fed themselves as usual in the 
home cages). Animals in the first two groups, given specific training with the non-preferred 
limb, shifted their preference. Animals in the third group might be compared to the local 
anaesthesia condition of Megirian et al. (1977), since they experienced stiffness in the 
initially preferred limb when it was unbound, but showed no signs of switching to the more 
physically able, but less preferred, paw (Peterson, 1951). Like the brush-tailed opossums, 
rats may sometimes persevere with the use of the preferred paw in the presence of a barrier 
designed by the experimenters to discourage them from doing so (Megirian et al., 1974). 

It may be concluded that rats, mice and the brush-tailed opossum tend to show strong 
consistent forelimb preferences for food reaching, which are not acquired during testing, 
and which are resistant to various anti-preference measures. In all these species the 
proportion of left-pawed animals is the same as the proportion of right-pawed, with very 
few ambidextrous individuals. 


Rotation and other side preferences 


Rats can be made to turn around in circles by lesioning the subcortical basal ganglia of one 
hemisphere, or by unilateral lesions at other stages of the extra-pyramidal motor pathway 
(Crow, 1971). They always turn towards the side of the lesion, or towards the less active 
side of the nigrostriatal system, and this phenomenon has been very useful for 
psycho-pharmacological investigations since recovered lesioned animals will rotate in 
response to dopaminergic drugs (Christie & Crow, 1971). As mentioned in the previous 
section on anatomy, there seems to be a slight pharmacological asymmetry of the 
nigrostriatal motor pathway in normal animals, which is enhanced by the administration of 
amphetamine. This sort of asymmetry in motor pathways would provide an example of a 
‘constitutional’ substrate underlying lateralization of motor behaviours, including forepaw 
reaching. There are no data on whether handedness in food reaching is correlated with 
physical asymmetries of motor pathways, but there is some indication that nigrostriatal 
asymmetries affect rotation, T-maze choices and bar-pressing (Glick et al., 1977). 


Intact rats will make frequent rotations (in the horizontal plane) in response to certain 
doses of amphetamine (especially if placed in a spherical enclosure), and individual animals 
exhibit a preference for one or other direction. Glick et al. (1977) believe that this 
preference reflects the underlying nigrostriatal asymmetry. Their view is supported by their 
finding that the direction of turning when rats are first placed in a spherical enclosure, or 
when activity therein is elicited by air puffs or shocks, is the same direction as that shown 
when rotation is due to a small dose of amphetamine. 

Glick et al. (1977) also report that side preference in a T-maze is related to higher 
concentrations of dopamine in the contralateral striatum. Rats were shocked in the central 
arm of the maze, and allowed to escape into either the left or the right arm. On 10 
consecutive trials, they showed side preferences that were stable from day to day and from 
week to week. Subsequent bilateral assays of dopamine showed a significantly higher mean 
level of this substance in the striatum contralateral to the preferred direction of turn in the 
T-maze. 

Side preferences (‘position habits’) are ubiquitous and usually an annoyance to be 
eliminated whenever animals are required to make spatial choices; but it is unlikely that 
nigrostriatal asymmetry is the only variable involved. For instance, rats allowed to press 
either a lever to the left of water location or a lever to the right to produce intermittent 
water rewards established strong preferences for the right or left lever; but videotape 
observation revealed that while about 75 per cent of the animals pressed the right lever 
with their right paw, or the left with their left, a quarter of them tried to press the right 
lever with the left paw or the left lever with the right paw. This makes assignment of 
laterality problematical (Glick, 1973; Glick & Jerussi, 1974). In these experiments, 
amphetamine (which, on the basis of the rotation results, increases nigrostriatal asymmetry) 
tended to shift lever preference towards paw preference (for example, right-pawed rats 
initially preferring to press the left lever shifted their preference to the right lever when 
injected with amphetamine). This probably means that paw choice is more closely tied to 
nigrostriatal factors than lever choice and that position habits for lever pressing are 
influenced by other variables. 

It is at present a matter for conjecture, but these pharmacological and anatomical 
variables investigated initially in the context of rotation may eventually add substance to 
the claim of Collins (1977) and others that individual rodents are constitutionally left- or 
right-handed. 

It is also useful to emphasize that motor asymmetries are not necessarily tied to 
manipulation. Asymmetries in locomotion, or in other species-specific activities, may be 
more interesting in some contexts (for instance, in relation to genetic control) than food 
reaching. Some species of Baleen whales, though lacking the skeletal distortions of the 
other suborder, may have skewed distributions of side preferences. Whales often swim on 
their sides near the bottom, or when feeding, and asymmetries of colouring (Fin whales) or 
in acquired barnacle encrustations (Gray whales), as well as direct observation, suggest that 
a large majority of individuals in some species swim more often with the right side down 
(Mathews, 1978). It used to be thought that there was a spiral component in whale tail 
movements, and Thompson (1942) believed that counter-torque at the head, during 
growth, produced Cetacean skull asymmetries in general, and the rifling of the Narwhal’s 
tusk in particular, but this theory has not been supported by subsequent observations 
(Mathews, 1978). 

It seems generally agreed that in another large mammalian species, the African elephant, 
particular animals have a preference for one of the tusks, which is consequently known as 
the ‘servant’ tusk (Sikes, 1971), but reports that the right tusk is generally more worn than 
the left (e.g. Shortridge, 1934) are probably not reliable. The use of asymmetrical gaits by 
horses is well known, as is the preference of some individual racehorses for turns in one 
direction. The use of asymmetrical gaits is extremely widespread in mammals (and also 
occurs in amphibians and reptiles), but individuals typically switch from right-lead to 
left-lead versions as circumstances require. Very little is known about the frequency of 
individual preferences, or the distribution of such preferences within or between species (see 
Hildebrand, 1977, for a review). 

Apart from locomotion, another category of behaviour where some species asymmetries 
are known, although information is sparse, is courtship and mating. Genital asymmetries 
are not uncommon in insects (Neville, 1976) and in the cephalopods, molluscs which may 
be counted honorary vertebrates on the grounds of brain development, mating is usually 
dextral. In the common octopus, for instance, the third arm on the right is modified as a 
sperm-transferring organ in the male (Wells, 1978). In the vertebrates, a similar degree of 
anatomically unambiguous asymmetry in copulation is probably confined to fish, but 
lopsided behaviour may be found in reptiles and mammals. One of the most celebrated 
vertebrate asymmetries occurs in the four-eyed fish (Anableps anableps) in which the large 
intromittent organ of the male is angled either to the right or to the left, and so is the 
female's genital opening, so that dextral males must mate with sinistral females, and so on. 
Garman (1896) claimed that there was a small preponderance of dextral males and sinistral 
females in his sample, so that the remaining (40 per cent) sinistral males and dextral 
females would have been at a disadvantage, but his observations do not seem to have been 
repeated. A related genus (Jenynsiids) has dextral and sinistral forms for both sexes, and a 
different order (Phallostethiformes) with another sort of copulatory organ has dextral and 
sinistral males, but bilaterally symmetrical females (Breder & Rosen, 1966). However, even 
more interesting in terms of genetic and non-genetic theories of asymmetries are species in 
which all or most individuals have the same laterality. The family of live-bearers 
(Poeciliidae — in the same order as Anableps) includes many such species, and a detailed 
anatomical survey of this family has been provided by Rosen & Bailey (1963). In the main 
subfamily, 136 species were grouped in 19 genera, with species-wide asymmetries of the 
male sexual organ characteristic of five genera. Of these five, two genera contain only 
sinistral species; one genus is composed of dextral species; in one genus species are either 
dextral or sinistral; and in the last genus species may be either sinistral, dextral or 
symmetrical. Rosen & Tucker (1961) observed sexual behaviour in species from 13 of the 
genera. (The family is of small fish, suited to the aquarium — one of the symmetrical genera 
includes the guppy.) Not surprisingly, they found that 100 per cent of the erections seen in 
an anatomically sinistral species were angled to the left. However, it was also found that a 
species with only slight physical sinistrality had 90 per cent of its erections to the left side, 
and a symmetrical species had most erections pointing to one side or the other, in equal 
numbers over the species, but with individuals showing right or left biases, confirming the 
report by Aronson & Clark (1952). This emphasizes that anatomical bilateral symmetry by 
no means precludes unilateral behaviours, although in this case no species-wide 
behavioural asymmetries were observed in physically symmetrical species. Another general 
point is that side-to-side pairings are inherently asymmetrical: although left or right choices 
will probably be made at random in the vast majority of species, it is conceivable that 
behavioural sinistrality or dextrality occurs in species of fish without the anatomical 
predestination which takes place in these South American live-bearers. 

Several thousand species of lower vertebrates have a similar kind of unilateral choice. In 
sharks and rays, and lizards and snakes, the male is equipped with two intromittent 
organs, both left and right, although the female possesses only one relevant orifice. Very 
little indeed is known about the deployment of the sexual anatomy in any of these species 
(Breder & Rosen, 1966; Carpenter & Ferguson, 1977), but in general it is clear that 
individual copulations must be either dextral or sinistral, although some species of shark 
are reputed to accomplish simultaneous insertion of both ‘claspers’. It may be that the 
anatomical duplication invariably indicates an adaption for the purpose of individual 
ambidexterity at a sideways task, but the possibility of unilateral preferences, in reptiles at 
least, is suggested by the peculiar behaviour of the Virginia opossum. This animal is a 
marsupial mammal considered not far removed from its reptilian ancestry. (It is sometimes 
called a ‘living fossil’, but should not be regarded as necessarily representative of any 
larger group — Clemens, 1977. It is not closely related to the brush-tailed opossum.) Some 
authorities have claimed that the Virginia opossum can conceive only while lying on its 
right side (e.g. Reynolds, 1952). A more considered view is that ‘It is obvious that the right 
side is the preferred one for the species and left-side copulation could reduce the probability 
of fertilization’ (Hunsaker & Shupe, 1977, p. 288). Left-sided matings have been observed, 
but the more frequent practice appears to be for the male opossum to mount the female 
from the rear, whereupon the pair falls to the right and intromission then takes place. In all 
marsupials (except kangaroos), semen must traverse the left and/or right lateral vaginas of 
the female, and the penis of the male is bifurcated (Sharman, 1970). Some anatomical 
advantage for the right-side channel, which would not be inconsistent with the individual 
variations observed by Ratcliffe (1941), might render asymmetrical behaviours worth while 
in the Virginia opossum and close relatives. In general, reproductive behaviour in mammals 
is bilaterally symmetrical. The urogenital system of vertebrates, like other internal organs, 
may frequently be asymmetrical, without leading to overt biases in behaviour. Although 
birds and lower vertebrates have more pronounced reproductive asymmetries than 
mammals (only the left oviduct is usually functional in birds) slight genital inequalities also 
occur in mammals. Implantation in only the right horn of the uterus occurs in some species 
of ungulates and rodents as well as in several species of bat (Adsell, 1966). There is a 
curious asymmetry in the reproductive organs of domestic even-toed ungulates, especially 
pigs and cattle, in that the penis shows a pronounced spiral deviation, like that of a 
left-handed corkscrew, with corresponding spiral ridges in the cervix of the female. Semen 
is ejected to the left and there are various asymmetries within the penis (Ashdown et al., 
1968; Hafez, 1974). 

It is extremely unlikely that these reproductive asymmetries are directly related to human 
handedness, but they are important to consider in the light of genetic and embryological 
theories of animal asymmetry in biology as a whole (Corballis & Morgan, 1978). In the 
cases of live-bearing fish and the Virginia opossum, it is possible that behavioural 
preferences would be reflected by differential effects of left- and right-hemispheric lesions on 
reproductive activity. Corballis & Morgan (1978) put forward the hypothesis that there is a 
universal vertebrate tendency for the left side of the body to develop faster than the right, 
which needs numerous modifications to account for the lack of consistency, between and 
within species, in manifestations of the universal tendency. Variations in direction of 
lateralization in the reproductive system, between both unrelated and closely related 
vertebrate species, suggest that any universal left-leading tendency allows equally well for 
sinistral, dextral and bilateral structures. 


Forepaw preferences in cats 


Domestic cats often use the forepaws one at a time — to pat a ball of paper in play, or to 
make a strike in more serious hunting — and pieces of food transfixed on one paw may be 
taken directly into the mouth. Experimental assessment of paw preferences in cats has been 
performed with methods similar to those used with other animals — variations on the theme 
of reaching for static items of food. Cole (1955) observed 60 cats reaching for food placed 
in a transparent tube fixed on the floor of the cage. Using a criterion of 75/100 reaches with 
the same paw, only 20 per cent were right-handed, 38.3 per cent were left-handed and 41.7 
per cent were ambidextrous. The preponderance of left-handers has not been confirmed by 
others (Forward et al., 1962; Warren et al., 1967) but the high proportion of ambidextrous 
cats appears to be a reliable finding: Warren et al. (1967) tested 34 young cats and of these 
48 per cent were ambidextrous using the 75 per cent criterion. This is for reaching out of a 
‘handedness box’ through a 2 in gap between the floor and the bottom edge of a glass 
front. There was no difference between the preference scores of animals given over 1000 
trials in the box between 2 and 5 months of age, and animals tested for the first time at 5 
months, which suggests that early learning was not a factor. The same 34 cats were given 
five more reaching tests by Warren et al. (1967) immediately after the handedness box test. 
These involved reaching through vertical bars at the front of a large cage for a piece of 
meat which was either in a food well, or covered by a wooden block which could be 
pushed aside, or in a trough, or in a glass tube, or under a large block that had to be 
pushed away. Seventeen of the 34 cats preferred the same paw in all the reaching tasks (50 
per cent) and, of these, 15 had consistent significant preferences (44 per cent). 
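
The 75 per cent criterion used in these studies can be illustrated with a minimal Python sketch. The function names and the per-animal reach counts are hypothetical, chosen only for illustration; the group sizes of 12, 23 and 25 are the counts implied by Cole's (1955) reported percentages for 60 cats, not figures quoted from the original report.

```python
from collections import Counter


def classify_paw_preference(right_reaches, left_reaches, criterion=0.75):
    """Label one animal by the share of reaches made with each paw."""
    total = right_reaches + left_reaches
    if total == 0:
        raise ValueError("no reaches recorded")
    if right_reaches / total >= criterion:
        return "right"
    if left_reaches / total >= criterion:
        return "left"
    return "ambidextrous"


def summarize(counts, criterion=0.75):
    """Percentage of animals in each category, rounded to one decimal."""
    labels = Counter(
        classify_paw_preference(r, l, criterion) for r, l in counts
    )
    n = len(counts)
    return {
        label: round(100 * labels[label] / n, 1)
        for label in ("right", "left", "ambidextrous")
    }


# Hypothetical (right, left) reach counts out of 100 for 60 animals:
# 12 meet the criterion on the right, 23 on the left, 25 on neither side.
sample = [(80, 20)] * 12 + [(15, 85)] * 23 + [(55, 45)] * 25
percentages = summarize(sample)
```

With these assumed counts the summary reproduces the 20, 38.3 and 41.7 per cent split. Raising `criterion` to 0.96 moves more animals into the ambidextrous category, which is one reason why comparisons between studies depend on the treatment of the data.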

It seems reasonable to say that about half the cats had paw preferences, whereas the 
other half did not. This would make cats considerably more ambidextrous than the rats, 
mice or Australian opossums discussed above. There is no reason why cats should not 
differ from rodents or marsupials on degree of laterality of food reaching. Although the 
apparent differences may be partly a matter of treatment of data, 62 per cent of the mice 
studied by Collins (1977) made at least 96 per cent of their reaches with the same limb, 
while only 38 per cent of Warren et al.’s kittens reached a similar criterion in the 
handedness box. 


Hand preferences in primates 


Use of the forelimbs for gathering and inspection of food and for bringing it to the mouth 
is typical of primates from bush-babies to man (Jolly, 1972), although all species also make 
some use of the hands in locomotion. Small and primitive primates such as the bush-baby 
normally use their hands for catching insects (a task requiring a considerable degree of 
accuracy and speed), while the anthropoid apes have a well-developed precision grip 
between the side of the thumb and index finger (Napier, 1961). However, data on 
handedness in non-human primates is extremely limited and consists largely of studies of 
food reaching in the rhesus monkey. 

Much of the work on food reaching in rhesus monkeys is due to Warren (Warren, 1953, 
1958, 1977; Warren et al., 1967). The view expressed by Warren et al. (1967) was that 
‘rhesus monkeys have lateral preferences that are relatively strong, moderately stable over 
many months, and rather resistant to change by training’, with the implication that the 
preferences were homologous to human handedness. More recently, Warren (1977) has 
explicitly recanted this implication and emphasized the dissimilarities between monkey 
hand-preferences and human handedness. 

The character of the evidence for lateral preferences in monkeys is relatively 
straightforward, if the theoretical implications are not: food-reaching hand preferences in 
rhesus monkeys are roughly similar to those observed in rodents. The method for assessing 
handedness is also similar, except that monkeys pick up small items of food with one hand 
under more varied conditions. 

Warren et al. (1967) and Warren (1977) discuss the results from a number of variations 
of the food-reaching test using a Wisconsin General Test Apparatus (Harlow, 1949). A 
peanut or raisin was presented, either on the surface of the food-tray in front of the 
monkey's cage, or at the end of a wooden trough extended away from the animal, or in a 
horizontal tube, or in a vertical tube. In an additional test, cereal rings were presented on 
horizontal wires in order to observe lateral movements — but the monkeys usually just 
broke the food off the wire. 

Typical performance in the WGTA requires object displacement — the pushing away of a 
card or block which covers a food-well. Usually monkeys push the covering object with 
one hand and pick up the peanut with the other, and therefore the object displacement 
qualifies as a bimanual task. The Warren studies employed three object-displacement tests 
(with a tin, a card and a wooden block as food covers), and two further bimanual tests: 
pulling in a raisin on the end of a light chain and dragging towards the cage (by a handle) 
a small wooden box placed far enough away to make this necessary if the animal was to 
get at the reward contained in it. 

Fourteen monkeys were tested on the entire series of tasks (200 trials on each test) and 
were retested twice 2 years later. Taking mean preference scores across all tests, the 
monkeys ranged evenly from strongly left-handed to strongly right-handed, and there was a 
significant correlation between the overall preference scores obtained 2 years apart 
(rho = 0.85). This was the basis for the conclusion of Warren et al. (1967) that stable and 
consistent hand preferences were observed, although, as with other animals, there was no 
indication of right- or left-handedness in the sample as a whole. The misgivings expressed 
by Warren (1977) arose from more detailed analysis of the same data, which showed that 
the consistency of the total preference scores was not characteristic of each individual test. 
There appeared to be three categories of test. The object displacement tests (requiring the 
pushing away of a tin, card or block from the food-well) not surprisingly correlated with 
each other very well at all stages. Four of the food-reaching tests (from the surface, vertical 
tube, trough, or box) were initially not intercorrelated, but at the two retests were 
correlated with each other and the object displacement tasks. Finally, reaching for food 
from the wire or horizontal tube, or from the end of the chain, and pulling at the chain or 
box, did not correlate well with each other or with the other tasks at any stage. 
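
The rank correlation (rho) used to compare preference scores across sessions and tasks can be sketched in Python as follows. This is a generic Spearman calculation written for illustration, not a reconstruction of the analysis by Warren et al.; the function names and the scores in the example are hypothetical.

```python
def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two sets of ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per cent right-hand use for five animals on two occasions.
first_test = [10, 30, 50, 70, 90]
retest = [25, 20, 65, 60, 95]
rho = spearman_rho(first_test, retest)
```

Because only the rank order of the animals enters the calculation, a high rho across sessions shows that animals keep their relative positions from left-handed to right-handed. It does not show that each separate test yields the same ordering, which is the distinction the more detailed analysis turns on.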

Warren (1977) interprets this variability between tasks as meaning that hand preferences 
in monkeys are task-specific, and strongly affected by experience and practice, and therefore 
not revealing of organismic or constitutional asymmetries, and not homologous to human 
handedness. Another study of 171 rhesus monkeys' simple food reaching from the home 
cage found that consistency and strength of hand preference increased considerably over 
only 3 days of testing (with 100 reaches twice each day — Lehman, 1978). An aspect of task 
specificity in Warren's data is that the bimanual object displacement tasks gave more 
consistent preference scores than the unimanual reaching response. Beck & Barton (1972) 
also found that a much higher proportion of monkeys (in this case, stump-tail macaques) 
had stronger preferences for bimanual than for single-handed tasks. In their experiments 
the bimanual tasks were more demanding than object displacement - they included holding 
open a spring-loaded drawer with one hand while taking out a raisin with the other, and 
undoing a series of latches in order to open a box. Task difficulty in general, or 
complementary asymmetrical use of both hands in particular, may increase the degree of 
hand preference exhibited by non-human primates. 

There are few systematic data on hand preferences in apes, but it is unlikely that 
handedness is any more significant in the behaviour of wild apes than it is in 
monkeys — neither the observations of Schaller (1963) of gorillas in their natural habitat, 
nor those of Goodall (1965) on chimpanzees, revealed strong hand preferences for food 
reaching in groups or in individuals. Goodall (1965) states that in the much cited 
chimpanzee tool-using behaviour — ‘fishing’ for termites with a piece of grass — either hand 
may be used. Schaller reported that of 72 male gorillas observed chest-beating, 59 (82 per 
cent) began with their right hands, but does not report strong limb preferences for the 
associated unilateral displays of leg kicking and vegetation throwing. 
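
As a rough check on how far Schaller's chest-beating count departs from an even split, the figures quoted above (59 of 72 animals starting with the right hand) can be put through an exact binomial test. This is only an illustrative sketch: the function name is hypothetical, and it assumes that each gorilla contributes one independent observation and that the chance expectation is 50 per cent, neither of which is stated in the source.

```python
from math import comb


def binomial_two_sided_p(successes, n):
    """Exact two-sided binomial p-value against a chance level of 0.5.

    The null distribution is symmetric, so the two-sided value is twice
    the tail beyond the more extreme of the two counts, capped at 1.
    """
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts quoted in the text: 59 of 72 gorillas began with the right hand.
share_right = 59 / 72
p_gorilla = binomial_two_sided_p(59, 72)

print(f"right-hand starts: {100 * share_right:.0f} per cent")
print(f"two-sided p against an even split: {p_gorilla:.2e}")
```

On these assumptions the split is very unlikely to be a chance departure from equality, but the test says nothing about whether chest-beating is a suitable index of hand preference.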
Relation to human handedness


One extremely clear generalization emerges from the data on lateralization of motor 
performance in animals, and it is as well to assert it before considering other aspects of the 
data which are not at all clear. None of the mammalian species in which forepaw 
preferences have been systematically studied shows a species-wide preference for one or
other forelimb. If there is an innate preference for use of the right forelimb in the human 
species, it does not seem to be foreshadowed in other mammals, with the possible exception 
of gorillas. Since it is the species preference for the right which seems to be of most interest 
in human handedness, further discussion may be superfluous. However, the suggestion by 
Annett (1970) that human hand preferences should be considered as a distribution skewed 
to the right gives some point to the question of whether mammalian handedness differs 
from human handedness only in the greater mammalian proportion of left-handers. 

Rats, mice, brush-tailed opossums and macaque monkeys, as individuals, may have 
strong and consistent hand preferences for at least one motor response — that of reaching 
for food; and in monkeys the same hand may be preferred for several forms of object 
manipulation. At a superficial level these individuals show lateralization of forelimb 
control - the difficult questions concern what is responsible for these paw preferences and 
for human hand preferences. The data suggest that such forepaw preferences as do arise in 
rodents are caused by constitutional factors rather than chance reinforcements. The work 
of Glick et al. (1977) suggests that physical asymmetries between the two sides of the
extrapyramidal motor system may account for some of the behavioural variance. The
usual assumption is that asymmetries of the motor cortex, or pyramidal pathway from it
(Cole, 1955), are responsible for behavioural asymmetries in higher mammals, but it has yet 
to be shown that either rodent or human handedness is caused by constitutional 
asymmetries in motor pathway anatomy.

It is unlikely that manual preferences will be explained purely in terms of anatomical 
asymmetries, even in animals, because when more than one task is observed in rats 
(Peterson, 1934) or in monkeys (Warren, 1977) direction of preference and its strength and 
consistency are seen to vary with the task. Warren seems to think that this makes the
preferences non-human. This would be so if human hand preferences were entirely 
consistent from task to task, and immune to practice effects. But we know, if we do not 
always acknowledge, that human hand preferences are not typically consistent from task to 
task (Annett, 1970, 1972). We know less about how practice affects hand preference in 
human skills, but all skills considered for a human adult are likely to be overpractised by 
comparison with all specific animal tests, with handwriting the most practised of all. 

In order to make more accurate comparisons between human and animal hand 
preferences, it would be useful to have different kinds of data in both cases. For humans 
more data would be helpful on food reaching responses for adults as well as infants 
(reaching for a biscuit, eating an apple, or drinking a cup of tea may very well be less 
strongly lateralized, on initial testing, than tool-using skills). More valuable would be data 
for animals on any other task apart from food reaching. If a monkey (or a rat) is taught a 
difficult asymmetrical task, and given considerable overtraining on it, how well does the 
task transfer if laterally reversed? Pressing a lever to within fine limits of angular 
displacement with one limb would be a straightforward test, and more elaborate tasks such 
as those used by Beck & Barton (1972) might be revealing. There seems to be little 
discussion of how well humans may transfer one-handed skills when forced to do so, but 
one would expect some transfer effects, visible for instance in the relearning of handwriting 
by adult amputees. 

At present no firm conclusions can be drawn, apart from the truism that both
constitutional and environmental factors probably affect hand preferences in animals and
man. Perhaps the most salutary implication that could be drawn from the study of animal 
limb preferences is that hand or paw preference in a particular task should not be taken as 
an indubitable sign that hemispheric dominance is being manifested.


3.2 Vocalization 


Although vocalization in man undoubtedly serves higher purposes than it does in other 
animals, the human vocal apparatus (lung, larynx, tongue, and so on) shares many 
features, including the forms of muscular control and innervation, with other mammals, 
and some general features with birds. While cerebral dominance for language may have 
occurred because of the special nature of human vocalization, alternative hypotheses (for 
instance, that species-specific vocalizations are controlled by only one side of the brain in 
all vertebrates) need to be eliminated, if lateralization of control of human language is to 
have a unique status. 

There is in fact very little evidence appropriate for deciding whether species-specific 
vocalization is under bilateral or unilateral control in species other than man; but it is 
possible to make a claim, in the cause of provocation if not strict accuracy, that production 
or reception of species-specific vocalizations is controlled by the left, but not the right, 
cerebral hemisphere, in all vertebrate species for which appropriate evidence is available. 
These include, unfortunately, only three species: the canary, the Japanese macaque 
monkey, and man. Left-hemisphere control of human speech has been discussed in a 
previous section, and it therefore remains to consider only the data for the other two
species.


Left-hemisphere dominance of song in the canary 


Singing by male canaries is severely disrupted by lesions to some areas of the left cerebral 
hemisphere, but much less affected by lesions to the same parts of the right hemisphere 
(Nottebohm et al., 1976; Nottebohm, 1977). Domesticated strains of the canary have been 
selectively bred for elaborate and reliable singing (by the males only), so that individual 
birds produce as many as 30 or 40 separate ‘syllables’, identifiable as distinct patterns on a 
sound spectrogram. Although the basic pattern of canary song emerges in a bird never 
allowed to hear another canary, infant birds normally imitate adults if they can hear them 
and make use of auditory feedback from their own efforts (Waser & Marler, 1977). The 
terminal auditory projection in birds, analogous to primary auditory cortex in mammals, 
occurs in a particular area of the neostriatal layer of the hemispheres. Nottebohm et al. 
(1976) found that the most severe disruptions of song occurred in canaries lesioned just 
above and behind this auditory projection, in a region of the hyperstriatum ventrale 
(although unilateral lesions of the auditory area itself had no immediate effects on singing). 
There appears to be hemispheric dominance of song control in the canary, since lesions of 
this ‘song control centre’ in only the left hemisphere reduced singing vocabulary from an 
average of 24 syllables to an average of less than one syllable in the second week after the 
operation, while similar lesions of the right hemisphere allowed retention of more than half 
(13) of the pre-lesion average (24 again) of syllables. This result was obtained using four 
birds with left-hemisphere lesions, and five birds with right-hemisphere lesions, and needs 
replication, but other evidence for left-sided lateralization of song control comes from more 
peripheral interruption of the song-control pathways. 

Many of the details of the acoustics of how birds produce sounds remain to be settled, 
but the relevant point here is that, although the larynx is used to control air flow, birds use 
a specialized sound source, the syrinx, which is positioned at the junction of the two 
bronchial passages at the bottom of the trachea. In the canary, the muscles on the left side
of the syrinx are larger than those on the right (Nottebohm & Nottebohm, 1976). The
main innervation of the syrinx is via branches of the hypoglossal (XII) cranial nerve (the 
left and right tracheosyringealis nerves). The motor control of the syrinx appears to be by 
an entirely ipsilateral pathway: from the left hyperstriatal centre to another telencephalic 
structure (in the archistriatum) and from there to an ipsilateral brain-stem nucleus and the 
hypoglossal nerve, and then to the left side syringeal muscles. When the left hypoglossal 
connection to the syrinx is severed, canaries can produce no complete syllables. The birds 
go through the motions of singing, but only faint and hoarse clicks or subsyllabic elements 
emerge. On the other hand, severing the right hypoglossal innervation of the syrinx may 
leave song entirely unaffected, and on average only a tenth of the song-syllables are 
modified or eliminated. This left hypoglossal dominance has been observed without any 
exceptions in a large number of canaries, and in chaffinches (Nottebohm, 1971) and in a 
species of sparrow (Lemon, 1973). It is, therefore, quite possible that left hypoglossal 
dominance occurs in large numbers (if not all) of passerine species, and left hemispheric 
dominance may be similarly widespread, although it has been directly tested only in the 
canary. 

It is certain, however, that not all bird species exhibit left hypoglossal dominance. In at 
least one species of parrot each hypoglossal innervates both halves of the syrinx 
(Nottebohm, 1976). Cutting either left or right hypoglossal connections to the syrinx 
produces only very minor changes in vocalization because both sides of the syrinx are 
adequately served by the remaining innervation. This does not necessarily preclude 
hemispheric dominance: unilateral damage to the nerves controlling the human larynx 
produces the same degree of hoarseness whether it is on the left or right side, and the 
remaining innervation will allow good recovery (Greene, 1964). 

The lateralization of vocal control as it apparently occurs in the canary is more 
thoroughgoing than the lateralization of human speech, in that, in the canary, the left 
hemisphere seems to control only the left effectors, while in man, each hemisphere projects 
bilaterally to the brainstem nuclei of most of the cranial nerves, including those governing 
the larynx and tongue (Espir & Rose, 1970). Is it the case that left-hemisphere control of 
vocalization in the canary is firmly built in and unalterable? Two aspects of Nottebohm's 
experiments suggest that the right hemisphere may take over control of vocalization after
early left-hemisphere damage. The birds whose song was severely impaired by
left-hemisphere lesions were operated on when they were 12 months old, when ontogenetic
development of song was virtually complete. After the immediate post-operative loss of 
song, a considerable degree of song recovery took place, so that by 7 months after the 
operation some of the birds had almost as many syllables as before the operation. Now 
they were subjected to sectioning of the right hypoglossal connection to the syrinx, and lost 
most or all of the ‘recovered’ syllables. This implies that the right hemisphere, rather than 
remaining tissue in the left hemisphere, had been responsible for song recovery (Nottebohm 
et al., 1976). It is also likely that a right-hemisphere takeover of song can be induced by
neonatal damage to the left hypoglossal, since if this nerve is cut in 2-week-old birds,
left-hemisphere lesions in the same birds as adults have a reduced effect (Nottebohm, 1977).

The canary thus seems to share with man a left-hemisphere dominance of vocalization 
combined with a right-hemisphere capacity for vocal control usually revealed only when 
the left hemisphere is damaged. Children without the use of the left hemisphere show
reasonable competence for language (Lenneberg, 1967) and Kinsbourne (1971) found that
adult left-hemisphere stroke patients who recovered from aphasia lost their recovered 
ability if the right hemisphere was disabled by sodium amytal injections. Some restitution 
of speech has been observed in adult patients after excision of the entire left hemisphere 
(Smith, 1978).


Left-hemisphere dominance of species-specific cry reception in the Japanese macaque monkey


It is claimed in a recent report that ‘Japanese macaques engage left-hemisphere processors 
for the analysis of communicatively significant sounds that are analogous to the lateralized 
mechanisms used by humans listening to speech’ (Petersen et al., 1978, p. 324). If this
claim is correct it provides something of a landmark in the search for subhuman cerebral
dominance. The evidence for it is as follows. 

Japanese macaques make a number of ‘coo’ sounds in the context of friendly social 
behaviour. Field recordings of two of these subtypes of vocal signal were the stimuli tested
in the Petersen et al. (1978) investigation. Functionally, one type of ‘coo’ is used by 
females when soliciting males for sexual purposes, and the other type seems to be a more 
general contact-seeking call. Acoustically, the soliciting ‘coo’ has a smooth late peak, and 
the general ‘coo’ a smooth early peak; both types occur with a range of fundamental 
frequencies. Discrimination of recordings of eight late-peak and seven early-peak calls, 
presented via headphones to either the left or right ear, was the experimental technique 
used to assess cerebral dominance. The animals were required to squeeze a tube to initiate 
a series of late-peak calls, played into either the left or the right ear in a randomly 
alternating sequence. The task was to hold the tube until the series of late-peak calls was 
interrupted by an early-peak call — if the monkey released the tube at this point it received 
a food reward.

The performance of five Japanese macaques, and five other Old World monkeys, was 
assessed to see if the early-peak stimuli were detected more successfully with one ear than 
with the other. The number of exemplars of each call-type included per session was 
gradually increased during the training of individual monkeys until the animal could 
successfully discriminate when all 15 stimuli were included on the same day. For each 
training session, each of the early-peak signals that had been included could be classified as 
showing left-ear advantage, right-ear advantage, or no advantage, on the basis of 
percentage correct scores achieved after reception at the individual ears. At the end of 
training, particular monkeys could be assessed according to the proportion of right-ear 
advantage instances thus accumulated. All five of the Japanese macaques had higher 
proportions of right-ear advantages than should have occurred by chance, although the
superiority was not overwhelming — on average only 60 per cent of the individual signal
tests showed better detection by the right ear. However, only one of the other five monkeys 
demonstrated a significant right-ear advantage. Two of these others and two of the 
Japanese macaques were then trained to sort the same field-recorded signals by pitch. The 
task was to detect a high-pitched call, irrespective of its functional type, in a series of 
low-pitched calls. Both the Japanese animals showed a reduced right-ear effect, in one case 
amounting to a significant proportion of left-ear successes, and the other two animals again 
did better with the left ear as often as with the right. Although samples of this size are
hardly conclusive, it would be expected that any left-hemisphere dominance effect should
occur with discrimination of complex and socially significant sounds rather than of pitch.

What may be concluded from the differences in performance which resulted from playing 
the early peak ‘coo’ into the right, rather than the left ear of these five Japanese monkeys? 
An analogous finding would almost certainly be interpreted as demonstrating 
left-hemisphere dominance in human subjects. An argument may be made that, in the
monkeys, the left hemisphere was exclusively responsible for detecting early peak ‘coos’.
For roughly 60 per cent of the samples, when the right ear was better than the left, it would 
be assumed that the right ear had better access to the left hemisphere than the left ear. This 
is in accordance with the anatomical and electrophysiological data. On 40 per cent of the 
samples, when the left ear was performing more accurately than the right, one would have
to assume that attentional factors had temporarily overcome the built-in right ear ease of
access of the left hemisphere. Since sounds were randomly alternated between left and right 
ears, this is not implausible. 

Catlin et al. (1976) performed a rather similar experiment to that of Petersen et al. (1978)
but used human subjects, and found a very small right-ear advantage in reaction time to a 
target speech sound: differences between the ears when sounds are presented monaurally 
are usually marginal, and dichotic presentation of certain kinds of stimuli is needed to 
obtain a more reliable right-ear advantage (Darwin, 1974). Dichotic presentation was not 
used by Petersen et al. as their research program was not primarily concerned with ear
differences. Thus although the magnitude of the effects obtained in their experiments was 
not great, it is entirely consistent with a left dominance of conspecific call reception, of the 
same order as that observed in man. 


Left-hemisphere dominance of animal vocalization. Conclusions 


Direct evidence for a greater involvement of the left hemisphere in production or reception 
of vocalization in a non-human species is limited to data on singing in four 
left-hemisphere-lesioned canaries (Nottebohm, 1977, 1979). Further experiments comparing 
the effect of left- and right-hemisphere lesions on singing in canaries are needed to confirm 
Nottebohm's results, and since there is as yet no evidence of this sort for hemispheric 
dominance of song in avian species other than the canary (Nottebohm, 1979), it is too early 
to say whether its occurrence in this bird is an isolated peculiarity, a result of selective 
breeding, or an example of a widespread avian (or vertebrate) phenomenon. 

It seems likely, however, that a peripheral asymmetry in vocalization as assessed by the 
differential effects of left and right denervation of the syrinx is characteristic of several 
seed-eating song-birds. It is conceivable therefore that hemispheric dominance, which is 
associated with left syringeal dominance in the canary, occurs in a number of other 
passerine species. Both the syringeal and the cerebral dominance effects have been 
measured exclusively for song production, rather than song reception, or response to vocal 
signals and calls more generally. In order to make further comparisons of cerebral 
dominance in birds and man it would be useful to discover whether there is any dominance 
for song reception. There are several ways to assess species-specific responses to song: for 
instance, captive female cowbirds will adopt certain postures on hearing recordings of male 
songs, and this is quantifiable (King & West, 1977; West et al., 1979). If it is confirmed 
that the left hemisphere is dominant for the production of male canary song, it would be 
interesting to know whether or not the female canary's left hemisphere is dominant for 
receiving it. Apart from examining possible cerebral dominance via species-specific 
responses to species-specific acoustic signals, more conventional laboratory techniques for 
discrimination learning might be used — one wonders whether the procedure utilized by 
Petersen et al. (1978) would reveal a right-ear advantage for detecting complex sounds in 
the canary. Conversely, one may ask if Japanese monkeys have left-hemisphere dominance 
of vocal production. If Petersen et al. (1978) are correct in inferring from their data that 
their monkeys' receptive mechanisms were lateralized in a manner analogous to those used 
by humans to analyse speech, the strength of the analogy must be tested by whether or not 
unilateral lesions of frontal cortex can be found to reveal left dominance of vocal 
production. Techniques for such experiments are difficult, but impairment of vocalization 
by bilateral lesions of frontal limbic regions has been demonstrated (Sutton et al., 1974). 
Only bilateral removal of the cingulate and subcallosal gyrus was tested in two animals. 
Neither unilateral nor bilateral lesions to frontal or parietal areas selected as homologs to 
human speech areas had any effect on the vocalization tested, but this was a prolonged call 
emitted by a restrained and isolated animal for automatic food rewards. Evaluation of the
social vocabulary after unilateral lesions would be preferable. In man, left dominance of
vocal production appears to be stronger than dominance of comprehension, but, on the 
other hand, left dominance may be less strong or even reversed for emotional and 
instinctive cries (Gardner, 1977). However, it would be expected that unilateral 
left-hemisphere rather than right-hemisphere lesions of secondary auditory cortex would 
produce pronounced impairment of the call-type discrimination tested in Petersen et al.’s 
ear-advantage experiment. No such difference was observed in the experiment of Hupfer et 
al. (1977) but only two animals appear to have been given appreciable unilateral lesions on
the left. 

Until more evidence is available, cerebral dominance of animal vocalization remains an 
interesting possibility rather than a confirmed fact. However, data on dominance and
vocalization as they stand are qualitatively different from those for handedness in animals.
Animal hand preferences are equitably distributed in the species — giving roughly equal 
numbers of left- and right-handers. It might have been expected, therefore, that any 
asymmetry in cerebral control of animal vocalization would follow the same pattern, with 
equal numbers of animals showing right dominance and left dominance. It appears that 
this is not the case — the dominance of the left part of the syrinx in canaries is almost
without exception, and the small sample of Japanese macaques all have right ear 
advantages (i.e. left-hemisphere dominance). It is slightly absurd to make inductions from 
only three species, but if left dominance holds across these species, explanations of why it is
the left rather than the right hemisphere which dominates speech in man may need a more 
general framework. 

Apart from the right or left question, a general explanation will be needed if many 
species show dominance of vocalization by either hemisphere, whether it is the left or the 
right, or if individuals within species have dominance by one or other hemisphere. A 
division of labour between hemispheres may be as advantageous to small-brained species as 
it is to ourselves, but if cerebral dominance of species-specific vocalization occurs with any 
regularity at all in non-human species, and even if it occurs only in canaries and macaques,
then the ‘doubling of cognitive capacity’ consideration (Levy, 1977) will be less apt.

Even the present hints that cerebral dominance in production and reception of sounds 
may not be a unique human characteristic prompt one to ask why bilateral control of 
sound production should be retained if it is not strictly necessary. A factor which
distinguishes the organs of vocal production in most vertebrates from, for instance, the 
limbs, is that it is not particularly useful to control each half of the vocal organs 
independently — the two sides of the body might just as well be operated in parallel for 
vocal functions, although this would certainly not be true for locomotion. Should we be
surprised, then, if any vertebrate species accomplishes vocalization with one hemisphere 
active and the other lazy? The same argument might of course apply to other oral 
movements apart from those connected with vocalization: left-hemisphere dominance for 
non-verbal oral movements may occur in man (Mateer, 1978), and there is certainly 
left-side dominant innervation of the oral region in the lancelet (Amphioxus) sometimes
considered as a vertebrate prototype (Kappers et al., 1936), but little is known of possible 
asymmetries of oral control in other vertebrate species in between. In the light of the 
‘immense left-sided larval mouth’ of Amphioxus (Young, 1962, p. 45) and the left-sided 
respiration of the very earliest chordates (ancestors of vertebrates — Jeffries, 1975), left-sided 
dominance of speech may be one of the most conservative of human features.


3.3 Perception


Hearing


A ‘least effort’ hypothesis may be applied to the reception of sounds. Because of the 
organization of the auditory pathways, the same auditory information can be analysed 
twice — once in each hemisphere. It would be odd for either hemisphere to ignore 
fundamental characteristics of auditory input — such as pitch and intensity — but analysis of
complex sounds, especially involving integration and comparisons over time, could 
reasonably become optional, with only one of the hemispheres organized to do it. 

It might be the case that hemispheric differences occur for the reception of 
species-specific cries, as discussed in the last section, in the absence of more general 
asymmetries in audition. On the other hand, somewhat broader categories of function, such 
as auditory short-term memory, or sequential analysis of complex sounds, rather than 
simply recognition of a collection of innately programmed signals, may characterize a 
hemisphere advantage. Two sorts of experiment in which differential effects of left- and 
right-hemisphere lesions on auditory performance seem to have been found may be quoted: 
experiments testing short-term memory for audio-visual associations in monkeys; and 
testing of sound localization in cats. 

Dewson (1977) argues that the left hemisphere of rhesus monkeys is more important than 
the right for some forms of non-vocal auditory processing, referring to differential effects on 
performance after lesions of auditory cortex in the left and right hemispheres. The task 
which revealed these effects involved memory for sounds and/or colours. Monkeys heard a 
1 kHz tone or a burst of white noise when they pressed a panel. After a delay period of up 
to 20 seconds, two lower panels were lit up, one red and the other green, with the position 
of the colours unpredictable. The monkeys had to remember to select red after the tone 
and green after the white noise. Before brain operations they did this with few errors over 
delays of 1 or 2 seconds, but made more and more errors as the delay was increased during 
a test session, declining to chance levels of accuracy at delays of between 10 and 20 
seconds. Left-hemisphere lesions, aimed at removing the cortex from a limited area of the 
superior temporal gyrus, produced a long-lasting drop in accuracy of performance with 
delays of more than a second. Similar lesions in the right hemisphere, however, seemed not 
to affect the ability to remember at all. 

This seems to be a clear demonstration of left-hemisphere dominance with an area of the 
cortex linked to the auditory modality (Dewson et al., 1969). The main reservation is that 
only five monkeys provided the data. Two were given right-hemisphere lesions and showed 
no deficit, and three received left-hemisphere damage after which their performance 
suffered. One of the right-lesioned animals was subsequently given the operation on the left 
side as well, whereupon the deficit appeared. This is all consistent with the possibility that 
the delay test taps exclusively left-hemisphere processes, but not sufficient to exclude a 
statistical null hypothesis. An additional peculiarity of the monkeys employed was that, 
except for one of the left-hemisphere animals, they had previously been deafened in one ear 
by cochlear destruction, but the side of the deafening varied and had no observed influence 
on the results. Thus a left-hemisphere monkey showed a deficit and a right-hemisphere 
monkey did not when deaf in the ear on either the same side or opposite side. 
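
The remark that these results cannot exclude a statistical null hypothesis can be illustrated with an exact test on the counts given above: three left-lesioned monkeys with a deficit and two right-lesioned monkeys without one. The sketch below is an illustration only, not an analysis reported by Dewson. The function name is hypothetical, and it assumes that the five first operations can be treated as independent observations and that a one-sided Fisher exact test is an appropriate way to examine them.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    With the row and column totals held fixed, this sums the probability
    of every table whose top-left cell is a or larger.
    """
    row1 = a + b            # left-lesioned animals
    col1 = a + c            # animals showing a deficit
    n = a + b + c + d
    max_a = min(row1, col1)
    total = comb(n, col1)
    return sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, max_a + 1)
    ) / total


# Rows: left lesion, right lesion. Columns: deficit, no deficit.
p_value = fisher_exact_one_sided(3, 0, 0, 2)
print(f"one-sided p = {p_value:.2f}")
```

Even this perfectly clean split is one of only ten equally likely arrangements of three deficits among five animals, so on these assumptions the probability of obtaining it by chance is 1 in 10, which does not reach the conventional 5 per cent criterion.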

If we were to assume that, in any non-human mammal, left-hemisphere auditory cortex 
alone is required for some aspects of sound recognition or short-term storage of 
sound-linked information, we would ask whether there might be a corresponding 
specialization of right-hemisphere hearing. A possible candidate would be 
sound-localization of some sort. It would provide a nice simple theory if mammalian 
hearing assigned ‘what is it?’ and ‘where is it?’ questions to the left and right hemispheres
respectively. Such a clean disjunction does not occur, but there is almost as much evidence,
in terms of numbers of subjects, for differential right-lesion effects on sound localization in 
cats as for left-lesion influence on recent sound memory in monkeys. There have been 
several reports of deficits in learned responses to localized sounds in cats with unilateral 
ablations of auditory cortex (Cranford et al., 1971; Whitfield et al., 1972; Whitfield et al., 
1978). Cats in these experiments run a Y-maze for food reward, which side is correct being 
signalled by tones from speakers behind the goal boxes. Simply running to the source of 
the only sound is a fairly rudimentary type of localization test, but accuracy at this task is 
severely impaired after bilateral lesions of auditory cortex, though not after unilateral 
damage (Neff et al., 1956). A slightly more difficult discrimination is to go left to a tone on
the left, but right when there are tones on left and right at the same time. On this ‘one 
versus two’ test, both left and right unilateral lesions have an effect, but only when the task 
requires running to the side contralateral to the lesion, in response to sound from both 
sources (Whitfield et al., 1978). An extra test given when ‘one versus two’ training had 
already taken place, involved the ‘precedence effect’ — the tendency of sounds from both 
left and right to be perceived as one sound, coming from the location of the first, if one 
precedes the other by about 5 ms. (Whitfield et al., 1972; Whitfield et al., 1978). 

In the initial reports, comparison of the behaviour of left- and right-lesioned animals 
‘revealed a very definite asymmetry’ (Whitfield et al., 1972, p. 26), with left-lesioned 
animals performing better than the right-lesioned cats on the one-versus-two 
discrimination, and especially on the precedence tests. However, most, if not all, of this 
difference is attributable to the asymmetry in training rather than the site of the lesion.
Whitfield et al. (1978) attribute any difference between left and right lesions to the direction 
of the training test, and individual variation. They characterize the basic deficit following 
unilateral lesions of auditory cortex as the lack of normal response to a compound stimulus 
localized on the contralateral side. This interpretation was roughly consistent with the 
performance of two surviving right-lesioned cats but not applicable to the performance of 
two of the three left-lesioned animals whose results are reported in the Whitfield et al. 
(1978) paper. It seems generally the case that there are extremely marked individual 
differences after unilateral brain lesions of auditory cortex. Cranford & Oberholtzer (1976) 
found that two out of five cats with left-hemisphere lesions, and one out of four cats with 
right-hemisphere lesions, seemed to show improvement in the precedence effects, while the 
others showed marked impairment. Individual animals, perhaps in response to physical 
hemispheric asymmetries such as those reported by Webster (1977), may favour one or 
other of the hemispheres for some sorts of binaural comparison, even if there is no 
species-characteristic left or right dominance. It can only be said, then, that the possibility of 
hemispheric differences in the spatial aspects of hearing in cats has not been completely 
ruled out. The four right-lesioned animals in the Whitfield et al. (1972) paper seem to have 
done rather badly, and this remains worthy of further investigation. The attraction, of 
course, is the supposed pre-eminence of the human right hemisphere in spatial matters. 
Although there is no agreement that the human right hemisphere is especially involved in 
the localization of sounds, a comparison of 78 human patients in all by Shankweiler (1961) 
suggested that right temporal damage impaired pointing to a sound location more than 
similar damage on the left. 

It must be concluded that evidence for hemispheric differences in hearing in mammals, 
not involving species-specific communication, is at present extremely slight, but not slight 
enough to demonstrate the absence of such differences. As there is already evidence for 
lateralization of singing in song-birds, and as there are anatomical asymmetries in the 
external ears of some nocturnal species of owl, possible hemispheric specializations in avian 
hearing need pursuing. It has recently been shown that pigeons can localize brief sounds by 
binaural disparities of time and intensity, in a manner analogous, if not homologous, to 
that of mammals (Jenkins & Masterton, 1979), and techniques are therefore available for 
determining whether there is any hemispheric dominance in these tasks. A more difficult 
undertaking would be the assessment of possible hemispheric differences in the 
echo-locating skills of porpoises and other Odontocete whales, but since the cranial 
asymmetries in these animals (referred to above) include dramatic differences in the size of 
the left and right jawbones, which are assumed by Erulkar (1972) to be the primary 
pathway for conduction of high-frequency sounds to the inner ears, speculation along these 
lines is not entirely groundless. 


Vision 


Since each hemisphere receives inputs from both ears via the ascending pathways and can 
thus independently survey the entire auditory field, specialization of one hemisphere for a 
demanding aspect of the analysis of auditory signals seems a plausible evolutionary 
development for any higher vertebrate species. Hemispheric specializations in other sensory 
modalities would be rather more surprising, as one would expect to find distinct 
asymmetries in performance. If, for example, we consider the consequences of a pigeon or 
rat developing a dominance for visual recognition of predators in the right hemisphere, we 
would be led to hypothesize a species peculiarly vulnerable to attack from its minor (right) 
visual field. No indications are to hand that such species exist, or that non-human 
vertebrate species with symmetrical bodies show obvious asymmetries in the effectiveness of 
perception for left and right sensory fields. 

However, in investigating the effects of unilateral brain stimulation on the emotional 
reactions displayed by doves to test stimuli, Vowles & Beazley (1974) report experimentally 
induced asymmetry of reaction: while receiving electrical stimulation of a site on one side 
of the forebrain, some doves would give fearful responses to a toy spider presented in the 
contralateral visual field, but aggressive responses when the identical stimulus appeared on 
the same side as the stimulation. Since birds have rather meagre connections between their 
cerebral hemispheres, and since each eye feeds most directly only to the contralateral visual 
centres, it is almost as if there must be separate control of emotional reactions to the 
information received by individual eyes. A similar sort of visual field effect has been 
observed in split-brained monkeys by Barrett (1969). Various midbrain and forebrain 
cross-connections ensure a degree of interaction between hemispheres, but Stevens & 
Klopfer (1977) have shown that some classically conditioned emotional responses to visual 
stimuli remained one-sided in experimental tests of interocular transfer in gulls, pigeons and 
chickens. For example, pigeons with one eye occluded, shown a distinctive cap-pistol before 
it was fired a few feet away, rapidly acquired the response of moving away from the pistol 
on sight, but showed no signs of recognizing it when only the previously occluded eye was 
available. The implication of such a finding is merely that the hemispheres may act 
independently, not that they have any built-in functional asymmetry; indeed, the greater 
the independence of the left and right visual fields the less opportunity there is for 
functional specializations. 

In order to obtain measures of the independent visual activities of the hemispheres in 
mammals, the cerebral commissures and/or the optic chiasma may be sectioned. The 
performance of human patients after such operations has, of course, been one of the main 
sources of evidence for human hemispheric differentiation. ‘Split-brain’ cats and monkeys, 
tested on visual discrimination learning tasks, have shown very few signs of imbalance in 
the learning abilities of the left and right parts of the visual system (Webster, 1977; 
Hamilton, 1977). In one experiment when a relatively abstract perceptual task was used 
(Robinson & Voneida, 1973), there was evidence of unequal hemispheric abilities in 
individual cats, but the same number of animals (three) was inferred to have 
right-hemisphere dominance as left. 


Spatial and tactile learning 


Orientation in space, in such forms as maze learning by rats, or homing by pigeons, might 
be expected to reveal sensitivity to right-hemisphere lesions if anything akin to human 
minor hemisphere functions is lateralized in these species. There is a finding which connects 
right-hemisphere damage to peculiarities of movement in a novel environment for rats 
(Denenberg et al., 1978) but it is not straightforward and is considered to indicate an 
effect on emotionality if anything (see below). 

Tactile information may be part of the human right-hemisphere specialization observable 
in split-brain patients (Ledoux et al., 1977). Although split-brain monkeys have not 
manifested asymmetries for the learning of tactile discriminations (Ebner & Myers, 1962b), 
it appears that a test more similar to those used with patients (feeling three-dimensional 
shapes) is sensitive to unilateral lesions of parts of somato-sensory cortex (SII — which is a 
bilateral projection), when the unilateral lesions are made in a monkey's ‘major’ 
hemisphere, determined by its hand preference (Garcha & Ettlinger, 1978). The number of 
animals involved (three) does not allow any estimate of whether this effect is due to 
lateralization or whether it is a task for which both hemispheres are required. However, 
any bilateral learning deficit produced by a unilateral lesion in animals is unusual, and is 
open to interpretation as a sign of lateralization until it is clear that the side of the lesion is 
irrelevant. 

It should be noted that there are data from electro-encephalographic assessments of 
hemispheric activity in mammals. Nelson et al. (1977) using rabbits, Webster (1977) using 
cats, and Stamm et al. (1977) using stump-tail monkeys, have all observed differences in the 
electrical activity of the two hemispheres: during sleep (in the rabbits and cats), and also 
during visual discrimination learning (the cats and monkeys) and a simple auditory 
discrimination (in the rabbits). In no case, however, was there a systematic difference 
favouring the left or right side for a particular species. There are technical and theoretical 
problems of inference from electrical activity measures (such as movement artifacts): it has 
recently been suggested that the electrical activity of the human hemispheres does not differ 
significantly during the performance of cognitive tasks when all stimulus and response 
artifacts are removed (Gevins et al., 1979). 

In the areas of perception and cognition in animals therefore — in so far as these 
correspond to the limited range of conditioning and discrimination learning tests that are 
employed — the only signs of a species-characteristic dominance of one hemisphere over the 
other have occurred when macaque monkeys have been given a relatively difficult task 
involving the processing of auditory information. It has been reported that Japanese 
macaques may have right-ear advantage for the reception of species-specific calls (Petersen 
et al., 1978) and that rhesus monkeys are affected by left-hemisphere, but not 
right-hemisphere, lesions in tests of recent auditory memory for an audio-visual association 
(Dewson, 1977). Further investigation of the differential effects of unilateral lesions to 
auditory projection areas in the left and right hemispheres of mammals and birds appears 
warranted. 


3.4 Emotionality 


Perception and production of facial emotional expression, or emotionality more generally, 
has been assigned to the human right hemisphere. While facial expressions like our own 
may be observed in primates (Jolly, 1972), but not in other orders, the brain mechanisms 
associated with emotionality are not so restricted, and if either left or right halves of the 
limbic system had a special effect on motivation and emotion in any vertebrate, this should 
be readily observable. In general, it seems that either half of limbic structures is sufficient 
to control behaviour, and bilateral lesions are therefore necessary to produce behavioural 
impairments. Systematic assessment of hemispheric asymmetries in the control of emotional 
behaviour in animals is, however, rare. It is arguable that specific ratings of, for instance, 
aggressive behaviours, sexual responses and reaction to novel objects would be preferable 
to an overall index of emotionality. Gibson & Gazzaniga (1972), for instance, suggest there 
is a hemispheric difference for eating raisins in split-brain monkeys. But, for convenience, 
‘emotionality’ in rats has frequently been measured by placing the animal for the first time 
in a shallow box a yard or so square, and counting the amount of movement (and 
sometimes amount of defecation) in a given number of minutes. Animals which move very 
little and defecate a lot are said to be most emotional (see Gray, 1979). 

A carefully planned study employing measurement of amount of movement in this ‘open 
field’ test has indicated that the right hemisphere may affect emotionality more than the 
left in rats (Denenberg et al., 1978). The absolute size of the mean difference was 
considerable, but so was variability within the groups of animals tested and the right/left 
hemisphere differences appear only under certain conditions of laboratory housing. Rats 
were handled daily for the first 3 weeks of life, or not so handled, then further divided into 
groups reared together in large and interesting enclosures, and animals living in pairs in 
bare laboratory cages, for a further month. After these differences in treatment during the 
first few weeks of life the rats were all caged singly until they were 4 months old, when 
brain operations were performed, followed by the behavioural test. Either the left or the 
right hemisphere was partially removed by suction, or a sham operation was performed. 

In general, the enriched-environment rearing increased open field activity a little, as did 
postnatal handling. The handling was crucial, however, in determining reaction to brain 
damage. Unhandled animals increased activity as a consequence of either left- or 
right-hemisphere ablations. Handled rats were made either very active, or virtually 
immobile by right-hemisphere damage, and affected very little by the left-hemisphere 
operation. It was the handled and then group-reared animals who were immobilized, and 
the handled, cage-reared rats who were extremely active (Denenberg et al., 1978). Without 
special handling, Robinson (1979) found that damage to the right hemisphere, but not to 
the left, made rats more active both in the open field and in a running wheel. 

In similar experiments Sherman et al. (1979) have obtained a slightly more direct effect 
of right-hemisphere ablations on an aggressive response. Male rats were handled or not 
handled as neonates, as in the Denenberg et al. (1978) procedure, and tested for mouse 
killing, when adults, by the introduction of a mouse into their individual home cages. 
There was no difference between left- and right-hemisphere ablations for the unhandled 
rats. But with the handled animals, the group with right-neocortex lesions took about 2 
days to kill the intruder, while the rats with left-hemisphere damage usually dispatched the 
mouse within the first 24 hours. This could be taken to indicate a greater involvement of 
the right hemisphere in aggression, or emotional reactions more generally in handled 
laboratory rats. 

It is odd that a functional asymmetry should appear only in rats subjected to disturbance 
in the first weeks of life. However, the research of Diamond et al. (1975) indicates that this 
may be a period when physical asymmetries in favour of the right hemisphere are at a 
maximum (see above). 

It would be unwise to conclude anything from these strange results (Denenberg et al., 
1978; Sherman et al., 1979) before replication. But effects such as these would be 
remarkable, if they are genuine, in that a species-characteristic left/right difference of any 
kind would be a major addition to the present rather flimsy collection of functional 
hemispheric asymmetries in animals other than man. 

4. Conclusions 


On the basis of the reports of hemispheric asymmetries in non-human species reviewed 
here, it is difficult to reject the null hypothesis that the vertebrate nervous system is an 
entirely symmetrical device, with the possible exceptions of the brains of humans and 
canaries. In only these two species is there strong evidence that damage to one side of the 
brain has behavioural effects different from those which result when the other side suffers 
similar injuries. In many species individuals have reliable motor preferences for one side of 
the body, but only in man is the distribution of manual preferences non-random. Do the 
comparative data therefore support the proposition that functional lateralization of the 
brain is a uniquely human phenomenon, or the more specific contention that human 
cognitive abilities depend on, or are significantly assisted by, functional lateralization? Even 
if only the results which indicate that vocalization in male canaries is controlled by the left 
hemisphere are accepted, there are implications for theories of human cognition. The 
possession of fully human cognitive abilities cannot now be said to be a necessary condition 
for the appearance of hemispheric lateralization in the vertebrate brain. An alternative 
hypothesis is that emphasis on vocalization is associated with lateralization of function. An 
account of this phenomenon could be given by appealing to an ontogenetic and 
phylogenetic ‘law of least effort’: hemispheric space is not used if it can be managed 
without, and vocalization — unlike, for instance, locomotion — does not require the left-right 
differentiation usually achieved by the symmetrical employment of both sides of the brain. 

Of all the modalities, hearing is the one for which a single hemisphere is most generously 
supplied with direct access to inputs from the entire sensory field, and thus is the modality 
where similar hemispheric economies could most easily be applied to perception. The 
possibility that macaque monkeys show preferential use of the left hemisphere for some 
aspects of hearing gives support to this idea. If signs of left dominance for vocal production 
or reception in non-humans are rare, behavioural data to suggest alternative specializations 
in the animal minor hemisphere are even rarer, but the possibility remains that some (or 
many) vertebrate species may favour the right hemisphere for emotionality and spatial 
knowledge. Marked physical asymmetries in the limbic system of lower vertebrates with 
very small brains, and in the cranial bones of some marine mammals with very large 
brains, provide a basis for the speculation that asymmetries of forebrain function may be 
widespread in vertebrates. 

These few bits of evidence seem negligible when set against the volume of reports 
concerning human handedness, and lateralization of cognitive processes. However, if there 
is any form to the comparative evidence at all, it is that human asymmetries in vocalization 
and hearing may have evolutionary precedents, if not primate beginnings; whereas human 
right-handedness lacks any obvious animal precursors. If it becomes necessary to provide 
independent accounts of left-brain dominance of vocalization (as a vertebrate or primate 
tactic) and right-handedness (as a peculiarly human characteristic), then human tool-using 
is sufficiently different from that of other species to have supplied an entirely new selection 
pressure. More theoretical parsimony would be achieved by assuming that handedness is 
secondary to language — in other words that right-handedness arose only when 
vocalization, already lateralized, became associated with manual skills. 


Memory for scripts with organized vs. randomized presentations 


Gordon H. Bower and Gail Clark-Meyers 





Memory for a word list is shown to depend upon wholistic properties that emerge when it is organized 
in a manner congruent with semantic knowledge. Subjects recalled 84 words exemplifying eight 
routine activities (scripts) such as attending a lecture. In Expt 1 recall was far superior and more 
organized when the words were presented organized into scripted activities within the day than when 
presented in random order. In Expt 2, recognition memory for the organized list exceeded that for the 
randomized list. False positive recognitions were highest to script-related lures for the organized list 
but were highest to word-associate lures for the randomized list. Thus, the organization of words to 
be learned determines emergent memory structures which affect recall and recognition performances. 





In the following studies we are concerned with how a person's memory for a set of items 
depends upon the way they are presented to him. We have chosen a list of words which 
exemplifies a particular conceptual structure. The issue is whether recall can be enhanced 
by presenting these items in such manner that their conceptual structure emerges. A 
secondary issue is whether subjects’ detection of the conceptual structure of the item set 
causes them to organize their recall by that structure. 

The item sets used here are words selected from several scripts or activity stereotypes. 
Schank & Abelson (1977) have used the word script to refer to the organized knowledge we 
have about routine activities such as riding on a bus, eating at a cafeteria, cashing a 
cheque, and so on. A script has a standard set of roles (stock characters), props, a standard 
sequence of scenes or actions, standard results, and standard entering conditions. Thus, the 
cafeteria scene has customers, food-servers and cashier; its props include trays, cutlery, 
food-warmers, a cash register; and standard scenes are to enter, wait in line, get a tray and 
utensils, select foods from the display, then pay, select table, sit, eat, and leave. The usual 
entering conditions are that the customer is hungry and has money to buy food; the 
standard result of performing the script is that the customer is less hungry, has less money, 
and the cashier (or owner) has some of the customer’s money. 

In examining words referring to activities within scripts, we noticed an interesting 
feature. Except for a few ‘script-title’ words, the remaining words taken out of context do 
not seem to be highly associated. Thus, wax and lesson are not associates of one another, 
nor are they associated to fall and fireplace. The words individually have many meanings 
and may occur in many contexts. Yet, these words are all related in a ski vacation script. 
Similarly, call, dress, program, curtain can be connected through a theatre (or concert) date 
script, and break, temperature, table, pain can be connected through a doctor script. Via 
intersection in memory, all the words together activate a memory structure that interrelates 
them, and it seems to do so in a way that is only minimally related to pairwise, word—word 
associations. We may say that the script, as a superordinate mediator, emerges from 
reflection upon the set of words. 

Suppose we present subjects with 10 words from each of eight scripts and tell them to 
learn to recall these in any order that suits them. If we present the words clustered into 
scripts, the presentation will look like that in Fig. 1. This shows the organized presentation 
used in our experiments. The eight scripts are conceived of as eight activities organized 
temporally within the day. The words within each script are ordered in approximately the 
order of the corresponding events or objects in the real-world activity. 

Figure 1. The 84 words organized into eight 10-word scripts with time-within-day categories. 


Presented with such organized word clusters, subjects should be able to detect the 
underlying script structure. They should then use that as a mnemonic for interrelating and 
remembering the words on the list. In comparison to a control subject learning 84 
unrelated words, the subject learning the organized script words has a number of 
advantages both in learning and in recalling. Whereas the control subject must concoct and 
remember a diversity of inter-item associations, the organized subject can use the script as 
a device for interrelating clusters of 10 words. His learning task is reduced to 
remembering the names of the eight scripts, and recalling only those items within each 
script which have a ‘list tag’ attached to them. (Something like a list tag is needed to help 
the subject discriminate presented from non-presented items of the script.) 

Subjects’ use of the scripts as mnemonics should be revealed by the organization of their 
free recall. That is, they will tend to recall the words of a script together, and they will tend 
to place the words within a script in their stereotypic order. The list structure thus forms a 
retrieval plan as well as a learning plan. The recaller knows how to begin her recall, how to 
proceed from cluster to cluster, from one word to the next within a cluster, and she should 
soon learn to detect when she has truly completed her recall. These are the marked 
advantages provided by use of the superordinate script structure of the list. 

Let us contrast learning within this organized condition to that in an ‘unorganized’ 
control condition. Here, the 84 words of Fig. 1 would be completely scrambled, with any 
word likely to appear at any position in the tree. The result of doing this is to obscure 
totally the script structure of the list. When examining such randomized lists, the naive 
subject can detect and use a few related words which emerge as clusters in recall, but these 
groups of words are too small to span the whole set. The result of studying such a 
randomized list, then, is a long memory set of weakly connected, small groups of words, 
with no concept relation between groups. Recall in such a case should be considerably 
poorer than in the organized conditions. Our first experiment was undertaken to check 
these predictions. 

Experiment 1 
Method 


Materials. The list of 84 words was presented in organized fashion as in Fig. 1. As Fig. 1 shows, the 
scripts were presented spatially from left to right so as to mimic a plausible temporal order of the eight 
activities within a day. Also, the words within a script were ordered from top to bottom in their 
stereotypic order. In this way, we maximized the organization of the presented word list. The 
randomized list comprised these same 84 words scrambled in their location within the tree structure 
of Fig. 1. 
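
The scrambling used to build the randomized list can be sketched as follows. This is only an illustration in Python, not the authors' procedure: the function name, the seed and the two-script miniature list are assumptions, with the example words borrowed from the ski vacation and doctor scripts mentioned earlier and their order chosen arbitrarily.

```python
import random

def scramble_positions(organized, seed=0):
    """Return a layout with the same columns as `organized`
    (script -> ordered word list) but with every word reassigned
    to a random position anywhere in the tree."""
    rng = random.Random(seed)
    # Pool all words, ignoring which script they came from.
    pool = [word for words in organized.values() for word in words]
    rng.shuffle(pool)
    scrambled = {}
    start = 0
    for script, words in organized.items():
        # Refill each column with as many words as it held originally.
        scrambled[script] = pool[start:start + len(words)]
        start += len(words)
    return scrambled

# Hypothetical miniature list (two scripts, not the real 84-word materials).
organized = {
    "ski vacation": ["wax", "lesson", "fall", "fireplace"],
    "doctor": ["break", "temperature", "table", "pain"],
}
randomized = scramble_positions(organized, seed=1)
```

The same words occupy the same set of positions in both versions; only the assignment of words to positions differs, which is what obscures the script structure.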


Procedure. The two groups of subjects were run as two intact groups, seated in a classroom. The 
subjects received general learning and free recall instructions. They were asked to write their recall on 
lined paper in a clustered fashion, writing on one row of the paper the words they thought of as a 
cluster, then moving to the next line to record the next cluster. The ‘cluster’ could be one word if 
they wished. Buschke (1977) introduced this method of ‘two-dimensional’ recall which greatly 
facilitates observation of clusters in free recall. 

Subjects received four study-recall cycles on their list. The study words were printed black on a 
24″ × 30″ white poster board placed on an easel at the front of the class where it was easily read by 
all. The subjects studied the whole list for 4 min. The poster board was removed, and subjects had 
5 min to write their free recall. This was ample time for all subjects to complete their recall. As soon 
as a subject completed his recall, he put down his pencil and the experimenter collected his recall 
sheet (preventing further study) and gave him a clean sheet to be used for recall after the next study 
trial. The words were presented the same way, using the same poster board, on each study trial. After 
the fourth study-recall cycle, subjects were debriefed and dismissed. 


Subjects. The subjects were 48 undergraduates at the University of California, Davis, participating for 
extra credit for their introductory psychology class. They were run in two intact groups of 24, at the 
same time on successive days. 


Results 


The primary results are the average numbers of words recalled over trials for the two 
presentation conditions. These are plotted in Fig. 2. Here we see that the organized 
presentation produces a huge superiority in recall in comparison to the random 
presentation. An overall analysis of variance on recall scores finds a significant trials effect, 
and an even more significant organization effect (F = 353.2, d.f. = 1, 46). The effect of 
organized presentation is apparent on the first recall trial (mean recall of 41 vs. 17 words), 
and it grows larger over trials (e.g. recall of 75 vs. 39 words on Trial 3). The interaction of 
trials with organization is significant (F = 3.56, d.f. = 3, 138, P < 0.05). Thus, 
organization promotes both a higher initial recall and a faster rate of approach to the 
asymptote of learning. These effects are so large that trials, organization, and their 
interaction account for 47 per cent of the variance in the recall scores. 

Recall by the organized subjects was truly remarkable. For example, on Trial 3, 15 of the 
24 organized subjects recalled all 84 words correctly. On Trial 4, 19 recalled perfectly. In 
contrast, for the random condition no subject ever attained perfect recall. 

The recall protocols were examined for evidence of clustering in output. Practically all 
the organized subjects mimicked the organization of the words as presented. Typical output 
was to write on one line the words remembered from one script, shifting to the next line for 
the words from the next script recalled. The scripts were usually recalled in the 
temporal order of the scripts within the day. The words recalled within a script were very 
likely to be recalled in the presented order. The primary errors were omissions, leaving out 
a few words from a script or leaving out an entire script. As learning progressed, the 
output organization came to reproduce the input organization more faithfully. 

Figure 2. Average words recalled over trials by subjects who study the organized versus the 
randomized list: —, organized, ——, random order. 


In contrast to the organized subjects, recall by the random subjects revealed considerable 
individual variation in the number and nature of clusters. The clusters (words per line) 
were initially smaller than those of the organized subjects, and the items within a group 
came from scattered locations on the presentation board. The words within a group were 
usually thematically related. Over trials, it appeared that the subjects began to pick up 
some of the script clusters underlying the list words. To document this point, however, 
would require more complicated cluster analyses on the recall protocols than we have 
undertaken. For our purpose, it suffices to cite the main impression given by the 
protocols — that the organized subjects use a stable script organization to guide their 
recall, whereas the random subjects compose scattered, small, idiosyncratic word groups 
in their struggle to make sense of, and to remember, the randomized words. 
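
One simple way to quantify the clustering described above would be the proportion of adjacent recalled words that come from the same script. The Python sketch below is an assumption offered for illustration, not an analysis reported in this paper: the function name, the word-to-script mapping and the two sample protocols are hypothetical.

```python
def same_script_adjacency(recall, script_of):
    """Proportion of adjacent pairs in `recall` whose words share a script.

    recall    -- list of recalled words in output order
    script_of -- dict mapping each list word to its script label
    Words missing from `script_of` (intrusions) are dropped before scoring.
    """
    words = [w for w in recall if w in script_of]
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    same = sum(1 for a, b in pairs if script_of[a] == script_of[b])
    return same / len(pairs)

# Hypothetical mapping using example words from two scripts.
script_of = {
    "wax": "ski", "lesson": "ski", "fall": "ski", "fireplace": "ski",
    "break": "doctor", "temperature": "doctor", "table": "doctor", "pain": "doctor",
}
clustered = ["wax", "lesson", "fall", "fireplace",
             "break", "temperature", "table", "pain"]
scattered = ["wax", "break", "lesson", "temperature",
             "fall", "table", "fireplace", "pain"]

print(same_script_adjacency(clustered, script_of))  # 6 of 7 adjacent pairs share a script
print(same_script_adjacency(scattered, script_of))  # no adjacent pair shares a script
```

A score of this kind ignores chance levels and within-script order, so it would be at best a first pass at the fuller cluster analysis the text says is needed.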


Discussion 


Clearly, organization of the learning materials had a tremendous impact upon our subjects’ 
performance. The effect is about as large as one would expect comparing recall of a 
coherent text with recall of the same words presented in scrambled order. The role of the 
script in suggesting an organization to our subjects is similar to the ‘theme’ as used in 
memory studies by Bransford & Johnson (1972) and Dooling & Lachman (1971). In 
their studies, subjects were read a prose passage which, though grammatical, made no sense 
unless a particular theme had been suggested (e.g. Christopher Columbus' voyage; washing 
clothes). The theme acts similarly to one of our scripts in that, for prose, it provides a 
consistent interpretation for ambiguous words and provides a way to understand causal 
linkages between asserted actions. Bransford & Johnson, and Dooling & Lachman found 
that subjects given the theme before reading the obscure text understood it and recalled it 
far better than subjects not given the theme. That result is analogous to our result insofar 
as an organizing structure for the to-be-learned material is available to some subjects but 
not others. 

A closer analogue to our script result is an experiment by Bower et al. (1969) using 
hierarchies of conceptually related words. Their conceptual hierarchies (e.g. minerals, 
instruments) were presented in tree form, with nodes of the tree being names of subsets or 
instances of higher nodes in the tree. They found that subjects rapidly learned to recall 
large numbers of such words when they were presented in such fashion as to make manifest 
the underlying conceptual structure; however, learning was very slow if the words were 
randomized for every presentation. The words in Fig. 1 have an organization similar to the 
conceptual hierarchies used by Bower et al. The top-level node in Fig. 1 is ‘daily calendar’, 
and the next-level nodes are organized by ‘temporal order of activities within day’; under 
these nodes script titles appear, followed by an ordered chain of events or objects within 
each script. And just as the subjects used the conceptual hierarchies to recall in the Bower 
et al., so did our subjects use the scripts of daily activities as a way to organize and recall 
all the words which appear on the study list. 


Experiment 2 


The preceding experiment demonstrated the advantage for free recall of having an organized 
structure subsuming the list. The conceptual structure of the list apparently serves as a 
retrieval plan for the words which are to be remembered. Let us now consider recognition 
memory tests which are less taxing upon retrieval. Suppose the person is tested by having 
to decide whether single words were presented or not presented on the input list he studied. 
For some subjects, the study list had been organized into scripts as in Fig. 1; for other 
subjects, the words on the study list had been randomized.

The performance predicted on the recognition test for these two conditions hinges 
critically upon the lures used as false test items. Consider first the case where we use lures 
that are semantically unrelated to all the list words. In such a case, if the organized subject
knows the scripts in his study list, then he can reject most of these lures because they do
not fit any of the scripts of the list. This would be analogous to a subject rejecting, say, a
name of a car when asked if it occurred in a list he knows consisted only of colour names. 
We would expect that subjects studying the organized presentation of the list would be 
more willing to reject unrelated lures than would subjects studying the randomized word 
list. Moreover, organized subjects should perform somewhat better at correctly identifying 
list words on the recognition test because only words closely related to the scripts were on 
the list. Thus, indexing discriminative performance by correct acceptances (‘hits’) minus 
incorrect acceptances (‘false alarms’), organized subjects should outperform random 
subjects. This should occur because the unrelated-lure recognition test makes script 
membership of a test word redundant with study list membership of that word. 

Consider next the case where the lures consist of other words which fit within the context 
of the scripts, but were not presented on the study list. For example, non-list lures 
appropriate to the skiing vacation script are parka, mountain and wet; lures appropriate to 
the doctor script are emergency, wait, examination and nurse. With such lures, we expect 
subjects receiving the organized study list to have a much harder time discriminating list 
from non-list words. If the subject only remembered which scripts were exemplified by the 
study list, he would perform at the chance level on such a test. We expect actual subjects to 
do better than this because they tag which words occurred within the scripts on the list. 
However, organized subjects should give more false positive responses to script-related 
lures than in the former case with unrelated lures. Moreover, their index of discrimination, 
hits minus false alarms, should be lower with script lures than with unrelated lures. In 
contrast, for the random subjects, script lures should elicit hardly any more false positives 
than do unrelated lures, because for random subjects the scripts are not psychologically 
present. This is to say that a script word is related to another word in its script only by
virtue of arousing the same script — otherwise, they are seen as unrelated individual words 
in isolation. Thus, the discriminative performance of random subjects on script lures should 
be close to what they show on unrelated lures. The experiment below looks for this
predicted interaction between organized vs. random study list and script vs. unrelated lures 
on the recognition memory test. 

If the word which occurs in a script is presented in isolation and free associations are 
collected, we have found that subjects give many associations that have no relation to the 
script meaning of the word. Thus, the word lift means one thing in a skiing script but when
given in isolation or in a random context completely different associates come to mind (e.g. 
elevator, hitch-hike, weight). This suggested that the random subjects might find the high 
associates to single list words to be particularly attractive lures. Thus, we took high 
associates of our list words, generated by a normative group, and used these as the lures on 
a third recognition test. The expectation was that the random subjects should give more 
false positive responses to these associates than to the script or unrelated lures. Moreover, 
random subjects should give more false positive responses to these lures than will the 
organized subjects. In fact, discriminative performance of the random subjects should be 
poorest on the test with these high-associate lures. 


Method 


Design. The six independent groups can be arranged into a two by three design. The first factor is 
organized vs. randomized study list. The second factor is the nature of the lures on the recognition
test. These were either all script lures, all high associates, or all words unrelated to the study list. 


Materials. The list of words to-be-learned was increased to 110 to keep recognition scores below 
ceiling for the organized conditions. These lists were comprised of the eight scripts used in Expt 1, 
but with the addition of three more words per script and two additional node words. The 110 words 
were presented as in Expt 1, printed on a poster board and shown for 4 min in front of the 
classroom. Three forms of the recognition test were composed. They all contained the same 25 list 
words (selected at random from the 110 presented), and 50 lures that differed between tests. The lures 
on one test were script lures — nouns that were frequently produced in the scripts that had been 
generated by a normative group of subjects but were not on the study list. Another test contained 50 
lures (nouns) that were chosen to be semantically unrelated to the list words. The third test contained 
50 lures that had occurred as high-frequency associates to the list words when the latter were given as 
isolated stimuli to a norming group of 40 University of California, Davis, undergraduates. All three
tests listed the 75 words in random order down the left side of an answer sheet on which the subject
checked an ‘old’ or ‘new’ box beside each word to indicate his judgement whether the word had
been presented in the study list or not. 


Procedure. The 30 subjects within a given condition were run together as one intact group. Following 
general ‘study’ instructions, the printed word list was shown on the posterboard in front of the group 
for 4 min. After the study period a mental arithmetic task was given. Subjects were asked repeatedly
to multiply a two-digit number by a one-digit number in their heads and write down the answer 
directly. This occupied them for 20 min, allowing some forgetting to occur so recognition accuracy 
would be at an intermediate level. After this interpolated activity, the recognition test was handed out 
and instructions were given on how to fill it out. Subjects were not told the nature of the lures nor 
the number of old (25) and new (50) items on the test. Subjects were given as much time as they 
needed to fill out the recognition test, turning in their papers when they completed the test. 


Subjects. The subjects were 180 undergraduates from University of California, Davis, as before. They 
were tested in six intact groups of 30, all within 3 successive days.


Results 


The primary results are the average percentage of correct acceptances of ‘old’ words 
(‘hits’, abbreviated h) and of false acceptances of ‘new’ words (‘false alarms’, abbreviated
f). These measures are shown in Table 1 for the six experimental conditions. 

Table 1. Correct recognition (h), false positive responses (f), and discrimination index (d′)
for the six experimental conditions

                Script lures    Association lures    Unrelated lures
Organized
  (h)               0.84              0.83                0.83
  (f)               0.31              0.13                0.04
  (d′)              1.54              2.07                2.69
Randomized
  (h)               0.65              0.62                0.65
  (f)               0.17              0.22                0.11
  (d′)              1.38              1.09                1.64

Looking first at the hit rate, we see that subjects studying the organized list have a
higher hit rate (mean of 0.83) than do those studying the randomized list (mean of 0.64).
This difference is statistically reliable, with F = 260, d.f. = 1, 179, P < 0.01. On the other
hand, hit rate is not affected by type of lures nor by the interaction of type of lure with
organization of the presentation. So, organized subjects had a higher overall hit rate than
did randomized subjects.

Second, the false positive rate (f in Table 1) is greatly affected by the type of lures
(F = 278, d.f. = 2, 179, P < 0.01) but not simply by the organization of the study list
(F = 0.7). Overall, the most false-positive recognitions occur to the script lures and the
least to the unrelated lures. The ordering of false alarms to the script lures vs. associate
lures depends upon the way the study list was presented. For organized subjects, who
presumably were aware of the script-nature of the study list, script lures attracted the most
false recognitions, about two and a half times more than the associate lures. In contrast,
randomized subjects gave more false positives to associate lures than to script lures. For
false alarm rate, the interaction between organization of presentation and type of lure is
statistically significant (F = 164, d.f. = 2, 179, P < 0.01). There are several sources of this
interaction. First, false positives to script lures exceed those to associate lures for the
organized subjects, but the reverse occurs for the randomized subjects. Second, with script
lures, organized subjects false alarm more than do randomized subjects. With unrelated
lures the reverse ordering is observed, with randomized subjects giving more false alarms
than organized subjects.

These interactions in false-alarm rate are consistent with our predictions. Subjects 
studying the organized list should have detected its script structure. That in turn 
should make them more likely than controls to falsely recognize script lures, but less likely 
than controls to falsely recognize unrelated lures that are thematically inconsistent with the 
study scripts. For the randomized subjects, on the other hand, high associates of the 
isolated list words were expected to be more attractive as recognition lures than were script 
words, because the underlying scripts were activated weakly (if at all) by the scrambled list 
these subjects studied. Thus, a subject who reads wax in isolation or in a random context is 
more attracted to mop or floor as a memory lure than to slope or pole, whereas in the
context of a skiing script slope and pole are much more attractive as lures than are mop and 
floor. 

Conclusions about memory sensitivity, or recognition memory ‘corrected for response 
bias’, depend upon which theoretical model of recognition performance is adopted. A 
simple discrimination measure is the hit rate minus the false-alarm rate, which indicates the 
extent to which the attractiveness of ‘old’ items exceeds that of ‘new’ lures. By this index, 
organized subjects discriminate better than do randomized subjects; and unrelated lures 
yield higher discrimination than associate lures, which in turn are higher than script lures.
There is also an interaction in degree of discrimination, with the advantage of organized
over randomized subjects being considerably less for script lures than for unrelated and
associate lures. The conclusions about the h-f discrimination index are a composite of the
statistically reliable effects found for hits alone and for false alarms alone.
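
As a worked illustration, the h-f index can be recomputed from the group means printed in Table 1. The sketch below is only arithmetic on those published means; the dictionary layout and the function name are illustrative assumptions, not part of the original analysis.

```python
# Mean hit (h) and false-alarm (f) rates copied from Table 1.
TABLE_1 = {
    "organized": {"script": (0.84, 0.31), "associate": (0.83, 0.13), "unrelated": (0.83, 0.04)},
    "randomized": {"script": (0.65, 0.17), "associate": (0.62, 0.22), "unrelated": (0.65, 0.11)},
}


def h_minus_f(table):
    """Return the hits-minus-false-alarms index for every group and lure type."""
    return {
        group: {lure: round(h - f, 2) for lure, (h, f) in lures.items()}
        for group, lures in table.items()
    }


index = h_minus_f(TABLE_1)

# Advantage of organized over randomized subjects for each lure type.
advantage = {
    lure: round(index["organized"][lure] - index["randomized"][lure], 2)
    for lure in index["organized"]
}
```

On these means the organized advantage is about 0.05 for script lures, against roughly 0.30 and 0.25 for associate and unrelated lures, which is the pattern the interaction describes.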

A more complex measure of recognition memory is the d′ of signal detection theory (see,
for example, Murdock & Duffy, 1972) which supposedly estimates the scaled mean
difference between the strengths (or ‘familiarities’) of ‘old’ and ‘new’ items, independently
of the subject’s criterion for accepting an item as ‘old’. If we assume that the strengths of
‘old’ and ‘new’ items have equal variances, then d′ can be estimated for each subject from
his hit rate and false alarm rate. The averages of these individual d′ values are shown in Table 1.
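
Under the equal-variance assumption, d′ is conventionally the difference between the normal deviates of the hit rate and the false-alarm rate. The following is a minimal sketch of that standard formula; the function name is an assumption, and the source does not say how rates of exactly 0 or 1 were handled (here they would raise an error).

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance signal detection estimate: d' = z(h) - z(f)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Applied to the organized/script-lure group means from Table 1.
example = d_prime(0.84, 0.31)
```

Applied to group means, this gives about 1.49 for the organized group tested with script lures, slightly below the 1.54 in Table 1. A small gap of this kind is expected, because the table reports the average of individual subjects' d′ values rather than the d′ of the averaged rates.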

The statistical conclusions to be drawn from these d′ measures are the same as for the
h-f index of memory. First, d′ is higher overall for organized than for randomized subjects
(F = 189, d.f. = 1, 179, P < 0.01). Second, overall d′ is highest for unrelated lures and
lowest for script lures (F = 74, d.f. = 2, 179, P < 0.01). Third, d′ reveals an interaction
between organization of the study list and type of lure. The d′ index of the organized
subjects is considerably larger than that for the randomized subjects on the unrelated and
associate lures, but not on the script lures (F = 254, d.f. = 2, 179, P < 0.01). Thus, the d′
analysis has not altered the conclusions about recognition accuracy based on the
hits-minus-false-alarms index.

We may look at the data in the rows of Table 1 from the hypothetical perspective that 
they were generated by a single group of subjects who were tested with a mixture of old 
items, script lures, associate lures, and unrelated lures. Setting the strength scale and 
criterion so that the hit rate and false alarms to unrelated lures are matched, the mean 
strengths estimated for the four types of items for the organized subjects are 0 for unrelated 
lures, 0.62 for associate lures, 1.25 for script lures, and 2.70 for old items. For the
randomized subjects, the corresponding strengths are 0 for unrelated lures, 0.28 for script
lures, 0.46 for associate lures, and 1.59 for old items. This hypothetical picture simply
shows on a common scale (namely d′) the significant sources of variance in discriminative
performance: organized subjects remember ‘old’ items better than do randomized subjects;
script and associate lures are somewhat similar in strength to ‘old’ items, and their
ordering reverses for organized vs. randomized subjects. 
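
The common-scale strengths can be approximated from the pooled rates by taking normal deviates and placing the unrelated lures at zero. This is a sketch under the assumption that an equal-variance scale was used, with the mean hit rates (0.83 and 0.64) and false-alarm rates taken from the text and Table 1; the function and key names are illustrative.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def strengths(hit, fa_script, fa_associate, fa_unrelated):
    """Mean strength of each item type, with unrelated lures fixed at 0."""
    base = z(fa_unrelated)
    return {
        "unrelated": 0.0,
        "associate": z(fa_associate) - base,
        "script": z(fa_script) - base,
        "old": z(hit) - base,
    }


organized = strengths(0.83, 0.31, 0.13, 0.04)
randomized = strengths(0.64, 0.17, 0.22, 0.11)
```

With these inputs the sketch returns values within about 0.01 of those quoted above, including the reversed ordering of script and associate lures for the randomized subjects.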

In conclusion, the recognition results have confirmed our predictions in a striking 
manner. Subjects reading the organized word list have the relevant script activated, and 
they are assumed to remember the list by (1) remembering the scripts on the list, and (2) 
tagging those items in the memory script corresponding to ones presented in the study list.
These two sources of information are employed differently in the recognition test. If the 
lures are unrelated, then information about the list scripts alone will suffice to reject such 
lures. This strategy is not available to the randomized subjects who have no compact 
characterization of the items on the word list. But if the lures are unmentioned items from 
the scripts list, then the list tags mentioned above for the organized subjects must be used to 
discriminate mentioned from unmentioned script items. Since this list tag is assumed to 
fade and be forgotten, script list items will be nearly indistinguishable from unmentioned 
lures as time passes. 

The results are consistent with the idea that the person will later falsely recognize items he 
thinks of while he studies the list, and further that the associates he thinks of when 
studying the words depend upon the context (or sequence) in which they are presented. In 
the present case, we were able to predict that the most highly evocative lures would be 
script lures when the word presentations were organized by scripts, but were high 
associates of isolated words when the words were presented in random fashion. 

The recognition results along with the earlier recall results demonstrate the ‘psychological
reality’ of scripts for memory experiments. The results are rather like what is obtained
comparing memory for organized lists of categorized words to unorganized lists of
unrelated words. A difference is that the effect of ‘blocking’ or structuring the input words
(vs. randomizing them) is considerably larger for scripts than for words belonging to
taxonomic categories.


Semantic and structural coding of the months


Philip H. K. Seymour 





Semantic and structural properties of the months were investigated in a descriptive study, and in a 
series of experiments involving Yes-No decisions about month names. The data were characterized
by boundary effects (increases in RT near transitions between structurally or semantically defined
divisions of the series) and by variations in the size of a response effect (difference between Yes and
No RT). This month × response interaction was interpreted as an influence of polarized semantic and
locative attributes on judgemental and response processes. 





The names of the months of the year form an interesting, though little studied, structure in 
lexical and semantic memory. One aspect of this structure is formal in character, and 
represents the months as a cyclic and ordered series having a fixed first and last member 
(Leech, 1969). This will be referred to as the structural coding of the months in the 
discussion that follows. It underlies the use of the month names as a system for labelling 
calendar time, and is common to members of all cultural groups who employ the 
Gregorian Calendar for this purpose.

A second aspect is concerned with the mapping of the months on to phases of a 
seasonally and culturally determined annual cycle in which consistent and repeated changes 
of climate, agriculture and individual activity are experienced. This is a rich associative
structure, incorporating connotations and symbols which have often been exploited in 
literature and the arts. It will be referred to here as the semantic coding of the months, and 
is viewed as a representation of scenic and emotive properties of phases of the seasonal 
cycle. 


Experiment 1 


The first experiment was part of a larger descriptive investigation of categories in semantic 
memory which has been discussed more fully by Seymour (1976). A group of 100 volunteer 
subjects answered questions about 10 categories, the names of the months and the names of 
the seasons being two of these. The questionnaire requested that they should give colour 
name and other verbal associations to the individual month and season names, and that 
they should indicate which months fell within each season. 


Colour associations of the months. The existence of a communality in the colour 
associations evoked by the month and season names has been commented on by Seymour 
(1976, 1977). Table 1 indicates the number of subjects associating each month with each of 
the 11 colour names which occurred most frequently in the response protocols. A seasonal 
trend was evident in the associations, and this has been reflected in the grouping of the 
colour names in Table 1. In general, the winter months, November-February, attracted a 
predominance of achromatic associations (especially white and grey), whereas the summer 
months, May-August, attracted colourful associations in which yellow and blue
predominated. These associations occurred for months falling in the transitions between 
summer and winter, but the spring months were also associated with green, and the 
autumn months with the colours brown and gold. 


Table 1. Frequencies of colour associations for the month names

Colour            Mar. Apr. May  June July Aug. Sep. Oct. Nov. Dec. Jan. Feb.
Spring colours
  green            14   24   17   11    5    2    ?    1    2    2    0    1
  pink              1    3    8    3    0    2    1    0    0    0    0    0
  Total            15   27   25   14    5    4    ?    1    2    2    0    1
Summer colours
  yellow            8   14    9   13   15   10    9    2    1    1    2    5
  blue             10    9    9   11   11    7    3    2    5    1    4    9
  red               1    2    3    6    8    9    4    5    1    7    5    2
  orange            0    0    1    6    6    8    3    6    0    0    0    0
  Total            19   25   22   36   40   34   19   15    7    9   11   16
Autumn colours
  brown             3    1    0    0    1   14   19   30    9    3    1    4
  gold              1    0    0    2    3    9    8    2    0    0    0    0
  Total             4    1    0    2    4   23   27   32    9    3    1    4
Winter colours
  white             7    3    3    1    0    1    3    3    3   37   39    5
  grey              9    5    0    2    1    1    3    3   26    6    1   20
  black             1    0    0    0    0    0    1    2    8   10    3    1
  Total            17    8    3    3    1    2    7    8   37   53   53   26
Other colours       2    4    3    4    3    3    3    2    5    6    3    4
Null reports       43   35   47   41   47   34   38   42   40   27   32   45


Seasonal membership of the months. Listings of the months assigned to each of the four
seasons varied from subject to subject. However, a consensus was evident when the months
were classified in terms of the frequency with which they were used as the first member of
each season. This is shown in Table 2, where it can be seen that a majority of subjects saw
March as the start of the British spring, June as the start of summer, and September as the
start of autumn. This consensus broke down in the case of winter, which was seen as
starting in November by some subjects, and in December by others.


Table 2. Frequencies of mention of months as first members of each season

Season   Jan. Feb. Mar. Apr. May  June July Aug. Sep. Oct. Nov. Dec.
Spring     3   13   68   15    1    -    -    -    -    -    -    -
Summer     -    -    -    1   16   76    3    1    -    -    -    -
Autumn     -    -    -    -    -    -    1   11   75   12    -    -
Winter    10    -    -    -    -    -    -    -    -    2    ?   41


Climatic and vegetative associations. The verbal associations (other than colour names) 
given in response to the month and season names were inspected and were found to fall 
into a number of distinct categories. These are shown in Table 3 which indicates the 
number of subjects responding within each category to each of the four season names. It
can be seen that the associations referred to (1) calendar events (e.g. Christmas, Easter), (2)
social and psychological conditions (activities, places, objects and moods), and (3) the 
natural cycle of the seasons (climatic conditions and plant and animal life). Details of the 
content of the responses and of the responses to the individual month names have been 
given by Seymour (1976). 

Table 3. Classification and frequency count of the verbal associations of the season names

                        Spring  Summer  Autumn  Winter
Calendar                   8       1       1      29
Social-psychological
  moods                   52      57      51      41
  activities              14      40       1      18
  things, places          20      48      23      ?0
Climate                   55      76      49      82
Vegetation                58      24      73      22
Animals                   34       9       8       5

Note. Frequencies > 6? and < 33 may be regarded as significantly high or low by a binomial test
(P < 0.001, two-tailed).


The present discussion will be limited to the climatic and vegetative associations of the 
seasons. Climatic associations predominated in the responses to winter and summer and 
reflected a contrast on a dimension of temperature (HOT ↔ COLD). Climatic references were
made by about half of the sample for spring and autumn, and involved mention of
freshness, brightness and rain for spring, and of wind, rain and fog for autumn. Vegetative
associations, on the other hand, occurred with average or high frequency for spring and
autumn, but with low frequency for summer and winter. This suggests that the spring-autumn
contrast is based on a dimension of vegetative change (plants GROWING ↔ plants
DYING).


Experiment 2 


The second experiment involved a more detailed investigation of the semantic coding of the 
months on the dimensions of temperature and vegetative change. On each of a series of 
trials the subjects were presented with a temperature statement (WARM WEATHER, COLD 
WEATHER) or a vegetative statement (PLANTS GROW, PLANTS DIE) followed by the name of a 
probe month. They were instructed to respond ‘Yes’ if the statement was true of the
month, and ‘No’ if it was not. For the purposes of the study, the months April-September
were defined as a period of warmth and growth of plants, and the months October-March
as a period of cold and death of plants.

The experiment was undertaken with two main aims in view. The first was to test the
hypothesis, derived from a study of the response frequencies in Expt 1, that the climatic 
dimension of temperature is more salient for the months than the vegetative dimension of 
growth. Production frequency has been shown to predict verification reaction time (RT) in 
other semantic classification tasks (Wilkins, 1971; Loftus, 1973), and it was therefore 
expected that verification of temperature statements would show an RT advantage over 
verification of vegetative statements. 

The second intention was to explore the hypothesis that the poles of the temperature and
growth dimensions carry positive and negative connotations. Recent linguistic and
psychological analyses have suggested that antonymous pairs, such as large-small,
up-down, deep-shallow, often involve a positive-negative opposition (Hamilton & Deese,
1971; Huttenlocher & Higgins, 1971), which has been related to the linguistic distinction 
between unmarked and marked terms (Bierwisch, 1967; Clark, 1974). The marked 
members of these oppositions are frequently expressive of absence or lack of extent on the 
dimension in question, and it is in this sense that they may be viewed as negatives 
(Huttenlocher & Higgins, 1971). If ‘cold’ is a negative of ‘warm’, and ‘die’ is a negative of 
‘grow’, an implication is that the winter months, October-March, will be assigned a
negative semantic coding. The central months, April-September, by contrast, possess the
unmarked properties of ‘warmth’ and ‘growth’, and may be considered to be semantically
positive. The colour associations fit into this account, being colourful for the central 
months and colourless for the winter months. 

Seymour (1975) has argued that the positivity and negativity of unmarked and marked
terms may influence the judgemental and response stages in tasks involving Yes-No 
decisions. Positivity is facilitating for Yes decisions and responses, and inhibiting for No 
decisions and responses, and the reverse is true of negativity. Hence, in a task involving 
semantic judgements about the months, we would expect to observe an interaction between 
Yes versus No responses and the segment of the year (central or peripheral) from which the 
probe month was taken. When central months are classified, Yes responses should show an 
advantage over No responses, but this effect should diminish or reverse when end months 
are classified. 


Method 
Subjects. The subjects were 10 volunteers from undergraduate classes at the University of Dundee. 


Apparatus. The queries and probe month names were presented to the subject by means of a VR14 
CRT display which was operated by a PDP 12 laboratory computer. The words were displayed in 
standard half-size upper case characters at the centre of the screen. The two words of the queries, 
WARM WEATHER, COLD WEATHER, PLANTS GROW and PLANTS DIE, were displayed one above the other 
for 2 s, followed, after a delay of 2 s, by presentation of the name of a month. This remained on the
screen until the subject made a vocal ‘Yes’ or ‘No’ response, and the computer clock was used to 
time this interval in ms. The computer detected the response when the relay of a voice-key was 
closed. 


Procedure. On arrival at the laboratory, the subjects were given an explanation of the assignment of 
climatic and vegetative properties to the months. They were told that a positive Yes response should 
be made to the months APRIL-SEPTEMBER following the queries WARM WEATHER and PLANTS GROW, 
and to the months OCTOBER-MARCH following the queries COLD WEATHER and PLANTS DIE, and that a 
negative No response should otherwise be made. The collection of data took place in two sessions,
each of about 30 min duration, involving a practice sequence of 12 trials followed by 144 test trials.
Within each session the 48 query × month combinations created by combining four queries with each
of 12 months occurred three times. 
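
The trial structure described above (4 queries × 12 months × 3 repetitions = 144 test trials per session) can be sketched as follows. The shuffled presentation order is an assumption, since the source does not describe how trials were ordered, and all names are illustrative.

```python
import random
from itertools import product

MONTHS = [
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
    "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
]
CENTRAL = set(MONTHS[3:9])  # APRIL-SEPTEMBER, defined as warm / growing

# True if the query was defined as true of the central months.
QUERIES = {
    "WARM WEATHER": True,
    "PLANTS GROW": True,
    "COLD WEATHER": False,
    "PLANTS DIE": False,
}


def build_session(rng, repetitions=3):
    """Return one session of (query, month, correct response) test trials."""
    trials = []
    for (query, true_of_central), month in product(QUERIES.items(), MONTHS):
        correct = "Yes" if (month in CENTRAL) == true_of_central else "No"
        trials.extend([(query, month, correct)] * repetitions)
    rng.shuffle(trials)  # assumed random order; not stated in the source
    return trials


session = build_session(random.Random(0))
```

Because each query is true of exactly six months, a session built this way contains equal numbers of Yes and No trials.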


Results 

Error frequencies. Errors occurred on approximately 6 per cent of trials overall. This aspect 
of the results has been summarized in Table 4 which shows error totals classified by probe 
month, target dimension (temperature, vegetation) and correct response (Yes or No). The
error scores demonstrate the occurrence of a boundary effect (a tendency for error frequency
to rise at the March-April and September-October transitions). Application of a Friedman
one-way analysis of variance to the error scores indicated that the effect was significant for
both the April-September and the October-March segments of the series (χ² = 22.2 and 15.76,
d.f. = 5, P < 0.01).

The error scores also show an interaction between months and Yes-No responses. When 
responding to the central months, April-September, subjects erroneously responded Yes on 
65 occasions, whereas erroneous No responses occurred on only 42 occasions. This bias 
toward false positive responses was significant by a Wilcoxon test (T = 7, P < 0.05). For
the October-March series, by contrast, incorrect Yes responses occurred on 25 occasions,
as against a total of 53 erroneous No responses. The bias in favour of false negative
responses was also significant by the Wilcoxon test (T = 3.5, P < 0.02).
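
The false-positive and false-negative totals quoted above follow directly from the error rows of Table 4: an error on a trial whose correct response was No is an erroneous Yes, and vice versa. The sketch below only re-adds the published counts; the data layout and function name are illustrative assumptions, and the trial total assumes 10 subjects, two sessions and 144 test trials per session as described in the Method.

```python
# Error counts per month from Table 4, keyed by
# (segment, dimension, correct response).
ERRORS = {
    ("central", "growth", "Yes"): [3, 2, 1, 1, 4, 9],
    ("central", "growth", "No"): [3, 4, 2, 2, 6, 10],
    ("central", "temperature", "Yes"): [11, 2, 0, 1, 0, 8],
    ("central", "temperature", "No"): [15, 6, 2, 1, 5, 9],
    ("end", "growth", "Yes"): [7, 0, 2, 4, 1, 7],
    ("end", "growth", "No"): [4, 0, 1, 0, 2, 5],
    ("end", "temperature", "Yes"): [11, 4, 3, 1, 4, 9],
    ("end", "temperature", "No"): [6, 1, 0, 0, 0, 6],
}


def erroneous(segment, given):
    """Count errors in a segment on which the subject answered `given`."""
    # The response given in error is the opposite of the correct response.
    correct_needed = "No" if given == "Yes" else "Yes"
    return sum(
        sum(counts)
        for (seg, _dimension, correct), counts in ERRORS.items()
        if seg == segment and correct == correct_needed
    )


total_errors = sum(sum(counts) for counts in ERRORS.values())
error_rate = total_errors / (10 * 2 * 144)  # subjects x sessions x trials
```

This reproduces the 65 versus 42 split for the central months and the 25 versus 53 split for the end months, with an overall error rate of about 6.4 per cent.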

Reaction times. A mean RT was calculated for each subject for each of the 48
query × month combinations. This was based on the six or fewer RTs available after
exclusion of errors and occasional voice key failures. The RT results have been summarized
in Table 4 and have been classified by query, probe month, and response. The results for
the April-September and October-March segments were submitted to separate analyses of
variance in which the factors were (1) dimension (temperature, vegetation), (2) response
(Yes or No), and (3) probe month.


Table 4. Mean RTs (ms) and error frequencies (totals) for vegetative and climatic
classification of the months

Central months
                    Apr.   May   June  July  Aug.  Sep.  Mean/Total
Growth
  Yes   RT           859   858   835   857   875  1046      888
        Errors         3     2     1     1     4     9       20
  No    RT           974   940   878   876  1023  1009      950
        Errors         3     4     2     2     6    10       27
Temperature
  Yes   RT          1026   788   784   798   803   930      855
        Errors        11     2     0     1     0     8       22
  No    RT          1077   946   865   874   930  1004      949
        Errors        15     6     2     1     5     9       38

End months
                    Oct.   Nov.  Dec.  Jan.  Feb.  Mar.  Mean/Total
Growth
  Yes   RT          1097   854   836   903   916   951      926
        Errors         7     0     2     4     1     7       21
  No    RT           994   897   834   927   982  1080      952
        Errors         4     0     1     0     2     5       12
Temperature
  Yes   RT           971   832   864   851   867   992      896
        Errors        11     4     3     1     4     9       32
  No    RT          1062   901   812   830   878  1053      922
        Errors         6     1     0     0     0     6       13


Months effect. Inspection of the data will indicate that the boundary effect noted in the 
analysis of errors was also strongly present in the reaction time data. RTs were in general 
greater at the March-April and September-October transitions than at the intermediate 
points, and this effect was significant for the central and end months (F = 11.38 and
17.97, d.f. = 5, 45, P < 0.001). However, the months effects interacted with responses and
dimensions for both series, and these interactions will be discussed more fully below. 

Dimension effect. It was argued above that the normative data suggested that queries
about temperature should be processed faster than queries about vegetation. However, the
advantage for temperature queries was only 17 ms for the central months, and only 30 ms
for the end months, and these differences were not significant in the analyses of variance
(F = 1.47 and 1.46, d.f. = 1, 9). For central months, the interaction of dimensions by
months was significant (F = 7.13, d.f. = 5, 45, P < 0.001). It can be seen from Table 4 that
this was because the advantage for temperature queries was only about 30 ms for May,
June and July, 83 ms and 60 ms for August and September, and that it reversed in favour
of growth queries for April. In the case of the end months, the three-way dimension ×
response × months interaction was significant (F = 3.41, d.f. = 5, 45, P < 0.01).

Inspection indicates that there was a temperature advantage for January and February, 
amounting to about 50 ms for Yes responses and about 100 ms for No responses, little or 
no difference for November and December, and a dimension × response interaction for 
October and March. 
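
As an illustrative check (not part of the original analysis), the month-by-month temperature advantage for the central months can be recomputed from the cell means of Table 4. The sketch below assumes that the Yes and No cell means are simply averaged with equal weight for each month; the function and variable names are hypothetical.

```python
# Cell means (ms) transcribed from Table 4, central months only.
MONTHS = ["Apr", "May", "June", "July", "Aug", "Sep"]
GROWTH = {"Yes": [859, 858, 835, 857, 875, 1046],
          "No": [974, 940, 878, 876, 1023, 1009]}
TEMPERATURE = {"Yes": [1026, 788, 784, 798, 803, 930],
               "No": [1077, 946, 865, 874, 930, 1004]}


def month_means(cells):
    """Average the Yes and No cell means for each month (equal weights assumed)."""
    return [(yes + no) / 2 for yes, no in zip(cells["Yes"], cells["No"])]


def temperature_advantage():
    """Growth RT minus temperature RT per month; positive values favour temperature."""
    growth = month_means(GROWTH)
    temperature = month_means(TEMPERATURE)
    return {m: g - t for m, g, t in zip(MONTHS, growth, temperature)}


if __name__ == "__main__":
    for month, advantage in temperature_advantage().items():
        print(f"{month}: {advantage:+.1f} ms")
```

Under this assumption the differences are roughly 30 ms for May to July, about 83 ms and 60 ms for August and September, and negative for April, and their average is close to the 17 ms overall advantage quoted in the text.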

Response effect. It can be seen from Table 4 that Yes responses to central probe months 
were in general faster than No responses. The response effect was about 78 ms overall, and 
was significant when tested by the analysis of variance (F = 13.04, d.f. = 1, 9, P < 0.006). 
However, in the case of the end months, the response effect was only 26 ms, and this was 
not significant (F = 1.46, d.f. = 1, 9). This outcome parallels the results of the error 
analysis. We can see a bias toward Yes responses in the advantage of the Yes RT and in 
the frequency of false positive responses observed for the central months. In the case of the 
end months, there is no Yes-No effect, and the error data show a bias toward false 
negatives. 
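
The size of the response effect in each segment can be recovered from the marginal means in the final column of Table 4. The following is a minimal sketch, assuming the two dimensions are averaged with equal weight; the names used are illustrative only.

```python
# Marginal mean RTs (ms) from the final column of Table 4.
TABLE4_MEANS = {
    "central": {"growth": {"Yes": 888, "No": 950},
                "temperature": {"Yes": 855, "No": 949}},
    "end": {"growth": {"Yes": 926, "No": 952},
            "temperature": {"Yes": 896, "No": 922}},
}


def response_effect(segment):
    """Mean No RT minus mean Yes RT, averaged over the two dimensions."""
    dimensions = list(TABLE4_MEANS[segment].values())
    yes = sum(d["Yes"] for d in dimensions) / len(dimensions)
    no = sum(d["No"] for d in dimensions) / len(dimensions)
    return no - yes


if __name__ == "__main__":
    for segment in TABLE4_MEANS:
        print(segment, response_effect(segment), "ms")
```

This reproduces the 78 ms effect for the central months and the 26 ms effect for the end months.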


Experiment 3 

The results of Expt 2 were in general consistent with the proposal that the semantic coding 
of the months on dimensions of temperature and plant growth incorporates a 
positive-negative bipolarity which influences the Yes-No decision or response process. If 
this interpretation is correct, the occurrence of a months × response interaction can stand as 
an indication of the involvement of semantic codes in the judgemental process. 

According to this argument, the months × response interaction should disappear in tasks 
which allow the subject to dispense with retrieval of the semantic codes when making his 
judgement. Three situations were examined in which it seemed likely that this might occur. 
In the first, subjects were presented with a number name, ONE, TWO, ... TWELVE, followed by 
the name of a probe month, and were instructed to respond ‘Yes’ if the month occupied 
the stated position in the sequence, and ‘No’ if it did not. In the second, the queries used 
were CENTRE and END, and the subjects were told to respond ‘Yes’ if the probe month fell 
within the specified segment of the year and ‘No’ if it did not. This latter task is formally 
equivalent to the temperature and growth task (Expt 2) since it involves discrimination 
between a central subset of months, April-September, and an end subset, October-March. 

It was considered that these two verification tasks could be carried out by accessing the 
structural coding of the months in which the formal locative properties of the series are 
specified. If this coding system is independent of the semantic coding system, and if it does 
not incorporate a positive-negative polarization, the response × month interaction 
demonstrated in Expt 2 should be eliminated. 

A third task was examined in which the queries were the season names, SPRING, SUMMER, 
AUTUMN and WINTER, and subjects responded ‘Yes’ to probe months falling within the 
designated season and ‘No’ to those falling outside. Seymour (1977, Expt IV) reported 
results of a study in which subjects responded to month names by naming the seasons to 
which they belonged. Printing the month names in seasonally congruent or incongruent 
colours had no effect on RT in this task (although this factor was important in other 
situations). A possible interpretation is that months are related to their season names by a 
formal structure of relationships which is independent of the semantic code. If so, we would 
not expect to observe response × month interactions in the season verification task either. 


Method 


Subjects. Three groups of subjects were tested. Sixteen of these were volunteers from classes at the 
University of Dundee. Eight were tested in the location verification experiment, and eight in the 
centre-end verification experiment. A further group of eight volunteers from the Sixth Form of Grove 
Academy, Dundee, took part in the season verification experiment. 


Apparatus. This was as for Expt 2. 


Procedure. The subjects in each group were given an explanation of the assignment of the months to 
the relevant categories. These were the locations ONE-TWELVE for the first group, the CENTRE 
(April-September) and END (October-March) segments for the second group, and the seasons 
SPRING, SUMMER, AUTUMN and WINTER for the third group. For the purposes of the experiment spring 
was defined as March-April-May, summer as June-July-August, autumn as 
September-October-November, and winter as December-January-February. The results of Expt 1 
suggest that this may involve a misclassification of November, but it was desirable from the 
standpoint of design of the experiment to assign three months to each of the four seasons. 

Following practice, the subjects in each group classified a randomized sequence of query × month 
combinations. In the location verification task there were 144 trials, with each month name occurring 
six times as a positive instance and six times as a negative instance. On the negative trials the number 
name differed from the probe location by one to three steps in either direction. The centre-end task 
involved 96 trials, in which each of the possible 24 query × month combinations occurred four times. 
There were 144 trials in the season verification experiment, such that each month occurred as a 
positive instance of its season on six occasions, and as a negative instance of each of the other seasons 
on two occasions. 
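
The trial totals follow directly from the design described above. A short sketch of the arithmetic is given below; the function and parameter names are hypothetical, and the default values are those stated in the procedure.

```python
N_MONTHS = 12


def location_trials(positive_per_month=6, negative_per_month=6):
    """Each month appears as a positive and as a negative instance."""
    return N_MONTHS * (positive_per_month + negative_per_month)


def centre_end_trials(n_queries=2, repetitions=4):
    """Every query x month combination (2 x 12 = 24) is repeated."""
    return n_queries * N_MONTHS * repetitions


def season_trials(positive_per_month=6, other_seasons=3, negative_per_other_season=2):
    """Each month is paired with its own season and with each other season."""
    negatives = other_seasons * negative_per_other_season
    return N_MONTHS * (positive_per_month + negatives)


if __name__ == "__main__":
    print(location_trials(), centre_end_trials(), season_trials())
```

In the season task this gives each month six positive and six negative trials, so Yes and No responses are balanced in all three tasks.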


Results 


Error frequencies were low in these experiments, amounting to less than 1 per cent in 
location verification, 2.5 per cent in centre-end discrimination, and 2 per cent in season 
verification. Subsequent analyses concentrated on the RTs for correct responses. 


Location verification. In the location verification experiment the mean Yes RT was 622 ms, 
and the mean No RT was 696 ms. The Yes-No effect of over 70 ms was significant when 
tested by analysis of variance (F = 19.71, d.f. = 1, 7, P < 0.003). There was no effect of 
months, and no response × months interaction (F < 1 in each case, d.f. = 11, 77). The No 
RTs were classified with respect to the numerical difference between the location of the 
probe month and the number presented as a query. The RTs were 702 ms for a single-step 
difference, and 691 ms and 676 ms for two- and three-step differences, but the distance 
effect was not significant in an analysis of variance (F < 1, d.f. = 2, 14). 


Centre-end verification. Table 5 presents the mean RTs and error frequencies observed in 
the centre-end verification task. Inspection will suggest that the boundary effect noted in 
Expt 2 also occurred in this study, and that there was a Yes-No difference for the central 
months, April-September, but not for the end months, October-March, again paralleling 
the results of Expt 2. 

Separate analyses of variance were carried out on the April-September and 
October-March series, with months and responses as factors. The months effects were 
significant in both cases (F = 6.17 and 9.21, d.f. = 5, 35, P < 0.001), chiefly on account of 
the increases in RT noted at the March-April and September-October boundary locations. 

Table 5. Mean RTs (ms) and error totals for centre-end classification of the months 

Central months 
                 Apr.    May   June   July   Aug.   Sep.   Mean 
Yes              1126   1019    951    852    978   1138   1011 
No               1204   1154   1016    994   1140   1204   1119 
Error totals        2      1      0      1      2      5 

End months 
                 Oct.   Nov.   Dec.   Jan.   Feb.   Mar.   Mean 
Yes              1144   1015    914    959    965   1164   1027 
No               1092   1022    987    959    974   1233   1044 
Error totals        2      0      0      1      0      5 

For the central months, Yes responses were made about 100 ms faster than No responses, 
and this effect was significant (F = 13.27, d.f. = 1, 7, P < 0.008). In the case of the end 
months there was a non-significant 20 ms advantage for Yes responses (F < 1, d.f. = 1, 7). 


Season verification. The results of the season verification experiment have been summarized 
in Table 6. An analysis of variance in which the 12 month names and Yes versus No 
responses were factors indicated that Yes responses were made 46 ms faster than No 
responses (F = 14.33, d.f. = 1, 7, P < 0.007), that there were differences among the months 
(F = 2.65, d.f. = 11, 77, P < 0.007), and that these two effects interacted (F = 2.13, 
d.f. = 11, 77, P < 0.03). 

Table 6. Mean RTs (ms) for classification of months with regard to season membership 

Central months 
             Spring months                   Summer months 
         Mar.   Apr.    May   Mean       June   July   Aug.   Mean 
Yes       711    742    716    743        686    674    789    717 
No        758    755    741    752        748    753    869    791 

End months 
             Autumn months                   Winter months 
         Sep.   Oct.   Nov.   Mean       Dec.   Jan.   Feb.   Mean 
Yes       694    690    816    734        710    715    798    741 
No        823    778    807    803        752    790    790    778 

Inspection of the data in Table 6 suggests that the months effect occurred because RT 
tended to increase in classification of the third member of each seasonally defined triplet, 
and that the interaction with responses occurred because this trend was more evident for 
Yes responses than for No responses. In order to test this interpretation a further analysis 
was carried out in which the factors were: (1) responses, (2) the season of the probe month 
name, and (3) the position of the month in its season. There was no effect for seasons 
(F < 1, d.f. = 3, 21), but there was an effect for position in seasons (F = 8.82, d.f. = 2, 14, 
P < 0.004), which interacted with the response effect (F = 4.49, d.f. = 2, 14, P < 0.03). The 
other interactions were not significant, although the response × seasons interaction 
approached significance (F = 2.36, d.f. = 3, 21, P < 0.10). 

These results indicate that the months × response interaction observed in season 
verification differs in form from that observed in the temperature-growth and centre-end 
experiments. A prediction from the hypothesis of positivity and negativity of coding is that 
there should be a Yes-No effect for summer months, no effect for winter months, and an 
effect for the later months of spring and the earlier months of autumn. The 
season × response × position interaction predicted by this analysis was not significant 
(F = 1.19, d.f. = 6, 42). 

The position effect, involving an increase in RT in classification of the third member of 
each seasonal triplet, corresponds to the results obtained by Seymour (1977) for naming of 
the seasonal assignments of the months. This suggests an involvement of a common 
associative structure of month-season relationships in the two tasks. 

A further analysis of the negative RTs considered the relationship between the season 
queried and the season of the probe month. The No RT was 827 ms if the probe came from 
the season before the target season, 784 ms if it came from the following season, and 
746 ms if it came from the opposite season. The difference between these RTs was 
significant (F = 4.21, d.f. = 2, 14, P < 0.04). 


Experiment 4 


Experiment 2 suggested that operations on the semantic coding of the months were 
characterized by a boundary effect and an interaction between Yes and No responses and 
the central versus peripheral segments of the year. The results for the centre-end 
verification condition of Expt 3 indicated that these effects also occur when subjects operate 
on the structural coding of the months. However, it appears that they are not intrinsic to 
the set of month names, since they did not occur in the location verification condition of 
Expt 3, and appeared in a modified form in the seasons verification task. 

The similarity of the results obtained in the temperature-growth and centre-end 
experiments is evidence against the proposal that the semantic and structural 
representations of the months are dissociated. However, it could be that this similarity of 
outcome occurred because subjects used both the structural and the semantic 
representation in the two tasks. It seemed desirable, therefore, to examine some further 
conditions in which the structural divisions imposed on the months did not correspond to 
dominant semantic distinctions. The verification of half membership (January-June, 
July-December) and of quarter membership (January-March, April-June, July-September, 
October-December) appeared appropriate for this purpose. 

A further issue was posed by the boundary effect noted in the temperature-growth and 
centre-end experiments. Although it seems likely that this effect is dependent on the 
distance between an imposed boundary and the location in the series of the probe month, 
we cannot exclude the possibility that retrieval of structural information occurs less rapidly 
at March-April and September-October than at other points in the series. As a test of this 
possibility a further condition was run in which subjects were presented with pairs of names 
of adjacent months under instruction to indicate whether they were correctly or incorrectly 
ordered. 


Method 


Subjects. The subjects were 24 volunteers from classes at the University of Dundee. Eight subjects 
were assigned to each of three experimental tasks, half verification, quarter verification, and order 
judgement. 


Apparatus. This was as in the preceding experiments. 


Procedure. All subjects received an explanation of the task followed by a period of practice. In the 
half verification task they were told that the words FIRST HALF or SECOND HALF would be displayed on 
the screen on each trial followed by the name of the probe month which should be classified by a Yes 
or No response. There were 96 trials, involving four presentations of each of 24 month × half 
combinations. In the quarter verification experiment the words FIRST QUARTER, SECOND QUARTER, 
THIRD QUARTER or FOURTH QUARTER were displayed on each trial, followed by a probe month. There 
was a total of 144 trials, in which each month name occurred as a positive instance of its quarter on 
six occasions and as a negative instance of each of the three remaining quarters on two occasions 
each. For the order judgement task, the displays consisted of 11 pairs of adjacent month names 
printed in correct order (e.g. FEBRUARY MARCH) or in incorrect order (e.g. MARCH FEBRUARY). The 
words appeared one above the other at the centre of the screen 2 s after offset of a visual warning 
signal. Subjects were asked to respond Yes if the vertical ordering of the names was correct and No if 
it was not. 
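
The set of displays used in the order judgement task can be enumerated as follows. This is only a sketch of the stimulus set: the number of repetitions of each display and the randomization scheme are not specified in the procedure and are not modelled here, and the function name is hypothetical.

```python
MONTH_NAMES = [
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
    "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
]


def order_judgement_displays():
    """Return (upper word, lower word, correct response) for each display type.

    The 11 adjacent pairs run from JANUARY-FEBRUARY to NOVEMBER-DECEMBER;
    the cyclic DECEMBER-JANUARY pair is not included.
    """
    displays = []
    for first, second in zip(MONTH_NAMES, MONTH_NAMES[1:]):
        displays.append((first, second, "Yes"))   # correctly ordered
        displays.append((second, first, "No"))    # incorrectly ordered
    return displays


if __name__ == "__main__":
    for upper, lower, response in order_judgement_displays():
        print(f"{upper:>10} / {lower:<10} -> {response}")
```

This yields 22 display types, 11 requiring a Yes response and 11 requiring a No response.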


Results 


Error frequencies were again found to be low in this experiment, amounting to 1.5 per cent 
in half verification, 2 per cent in quarter verification, and about 3 per cent in the order 
judgement task. The main analysis concentrated on the mean correct RTs. 


Half verification. A summary of the RT data for the half verification experiment has been 
given in Table 7. It can be seen that the change in instruction shifted the boundary effect to 
the June-July transition. There was also a Yes-No effect which appeared greater for 
months in the first half of the year than for months in the second half. 


Table 7. Mean RTs (ms) for classification of the months in terms of halves and quarters of 
the year 

First half 
             Jan.   Feb.   Mar.   Apr.    May   June   Mean 
Halves 
  Yes         734    798    727    711    722    813    751 
  No          862    798    788    838    860   1047    866 
Quarters 
  Yes         715    753    844    759    805    836    785 
  No          799    841    755    831    792    839    810 

Second half 
             July   Aug.   Sep.   Oct.   Nov.   Dec.   Mean 
Halves 
  Yes         896    881    737    773    793    751    805 
  No          964    873    802   7719    843    793    843 
Quarters 
  Yes         809    787    789    923    759    743    803 
  No          861    794    804    850    780    816    818 


An analysis of variance of the results for the first-half months showed a significant 
months effect (F = 4.58, d.f. = 5, 35, P < 0.003), and a significant response effect of 115 ms 
(F = 14.95, d.f. = 1, 7, P < 0.007). The months × response interaction approached 
significance (F = 2.30, d.f. = 5, 35, P < 0.06), probably because the response effect was 
absent for February and exaggerated for June. The analysis of the results for the 
second-half months also showed a significant months effect (F = 6.91, d.f. = 5, 35, 
P < 0.001). However, the response effect of 44 ms was not significant (F = 1.77, d.f. = 1, 7), 
and did not interact with the months (F < 1, d.f. = 5, 35). 
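
The pattern described for the first-half months can be checked against the halves rows of Table 7. The sketch below uses only the first-half cell means and assumes the response effect is taken as the unweighted mean of the monthly No minus Yes differences; the names are illustrative.

```python
# Halves task, first-half months: mean RTs (ms) from Table 7.
FIRST_HALF_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "June"]
HALVES_YES = [734, 798, 727, 711, 722, 813]
HALVES_NO = [862, 798, 788, 838, 860, 1047]


def monthly_response_effects():
    """No RT minus Yes RT for each first-half month."""
    return {m: no - yes
            for m, yes, no in zip(FIRST_HALF_MONTHS, HALVES_YES, HALVES_NO)}


def mean_response_effect():
    """Unweighted mean of the monthly differences."""
    effects = monthly_response_effects()
    return sum(effects.values()) / len(effects)


if __name__ == "__main__":
    print(monthly_response_effects())
    print(round(mean_response_effect()), "ms")
```

The monthly differences show no effect for February and a difference of over 230 ms for June, with a mean close to the 115 ms reported.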


Quarter verification. Table 7 also gives a summary of the results obtained in the quarter 
verification experiment. A preliminary analysis was carried out to test for a response effect 
and an interaction with the months. The Yes-No difference was only 20 ms, and was not 
significant (F = 1.01, d.f. = 1, 7). However, there was a months effect (F = 2.06, d.f. = 11, 
77, P < 0.03), and this interacted with responses (F = 2.34, d.f. = 11, 77, P < 0.02). A 
further analysis was carried out in which the factors were responses, quarters and positions 
in quarters. This showed a significant quarters × positions interaction, which occurred 
because RT increased as a function of position for the first quarter but decreased as a 
function of position for the fourth quarter. The three-way interaction with responses was 
also significant (F = 4.77, d.f. = 6, 42, P < 0.001). Follow-up tests indicated that this was 
because the interaction of positions with the first versus third quarters occurred for positive 
responses but not for negative responses. 

These results differ from those obtained in the season verification experiment in that they 
do not show an increase in RT for the third member of each triplet. They also fail to 
reproduce the response effects observed in the other experiments. The results of the 
temperature-growth and centre-end experiments suggest that there should be a Yes 
advantage for months from the second and third quarters, but tests on these quarters gave 
no evidence of such an effect (F = 1.25 and 0.54, d.f. = 1, 7). 

The negative RTs were reclassified to allow a test for an effect of the distance between 
the quarter specified in the query and the quarter to which the probe month belonged. No 
significant distance effects were obtained. 


Order judgements. The results of the order judgement experiment are summarized in Table 
8. If the structural codes needed for discrimination of ordinal position were of equal 
accessibility or utility at all points in the months series we would not expect to observe RT 
differences among the month pairs. It can be seen that in the event the judgemental RT 
was found to vary substantially, being lower for the end and centre pairs, 
January-February, May-June, June-July and November-December than for the 
intervening pairs. This effect was significant when tested by analysis of variance (F = 4.92, 
d.f. = 10, 70, P < 0.001). Yes responses were faster by 122 ms than No responses 
(F = 49.32, d.f. = 1, 7, P < 0.001), but the response effect interacted with the month pairs 
(F = 2.83, d.f. = 10, 70, P < 0.005). This seems to reflect some variation in the size of the 
effect among the end months, ranging from 30 ms for February-March to over 240 ms for 
January-February. Thus, the response × months interaction seems not to parallel that 
found in the temperature-growth and centre-end experiments. There is no indication of an 
elimination of the response effect when end months are classified, although the data do 
imply an influence on the judgemental process of characteristics of the individual months. 


Table 8. Mean RTs (ms) for classification of pairs of adjacent month names as correctly or 
incorrectly ordered 

Month pair       Yes     No 
Jan./Feb.        876   1119 
Feb./Mar.       1142   1172 
Mar./Apr.       1015   1287 
Apr./May        1093   1225 
May/June         959   1086 
June/July        934   1038 
July/Aug.       1104   1174 
Aug./Sep.       1103   1192 
Sep./Oct.       1087   1222 
Oct./Nov.       1119   1221 
Nov./Dec.       1008   1050 
Mean            1040   1162 
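
The Yes-No differences for individual month pairs can be recomputed from the cell means of Table 8. This is an illustrative sketch only; the names are hypothetical and the overall effect is taken as the unweighted mean across the 11 pairs.

```python
# Mean RTs (ms) for the 11 adjacent month pairs, transcribed from Table 8.
PAIRS = ["Jan/Feb", "Feb/Mar", "Mar/Apr", "Apr/May", "May/June", "June/July",
         "July/Aug", "Aug/Sep", "Sep/Oct", "Oct/Nov", "Nov/Dec"]
YES_RT = [876, 1142, 1015, 1093, 959, 934, 1104, 1103, 1087, 1119, 1008]
NO_RT = [1119, 1172, 1287, 1225, 1086, 1038, 1174, 1192, 1222, 1221, 1050]


def pair_response_effects():
    """No RT minus Yes RT for each adjacent month pair."""
    return {p: no - yes for p, yes, no in zip(PAIRS, YES_RT, NO_RT)}


def overall_response_effect():
    """Unweighted mean of the per-pair differences."""
    effects = pair_response_effects()
    return sum(effects.values()) / len(effects)


if __name__ == "__main__":
    for pair, effect in pair_response_effects().items():
        print(f"{pair:<10} {effect:>4} ms")
    print("overall", round(overall_response_effect()), "ms")
```

The differences for February-March and January-February come out at 30 ms and 243 ms respectively, and the overall mean is about 122 ms, in agreement with the values given in the text.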


General discussion 


This paper illustrates the application of a domain-oriented approach to the study of human 
permanent memory. The experiments concentrated on an investigation of a single 
representational system in semantic memory (the formal and associative superstructure of 
the month names). The method has the advantage of holding constant lexical aspects of the 
probe stimuli (the 12 month names) and of the responses used as output (the reports Yes 
and No) while examining properties of the internal knowledge structure. 

The principal data consisted of (1) variations in the RT to classify individual months (i.e. 
changes in the shape of the RT profile of the months), and (2) variations in the size of the 
Yes-No difference associated with particular months or groups of months (the 
months × response interaction). A general assumption has been that the RT profile reflects 
aspects of the structural representation of the months, whereas the Yes-No effect is 
sensitive to semantic factors (i.e. positivity or negativity of bipolar dimensions). 


Lexical judgements 


In the location verification task (Expt 3) the RT profile was statistically flat and there was 
no months x response interaction. It is likely that subjects transformed the number names 
into representations of month names and made their judgements at this lexical level. The 
locative and semantic superstructure of the months was not involved. Quarter verification 
and season verification could, in principle, be carried out in the same way. The subject 
might retrieve three month names from memory and utilize these in a lexical comparison. 
However, both experiments yielded month × response interactions, and the relation between 
the target set and probe month had an effect in season verification (though not in location 
or quarter verification). 

It seems likely, therefore, that there was some involvement of structural or semantic 
codes in quarter and season verification. The effects were chiefly delays of Yes responses to 
probes occupying particular positions, i.e. the third position of the first quarter and first 
position of the fourth quarter, and the third position of each seasonally defined triplet. This 
could mean that the seasons are defined in terms of their starting months, and that the first 
and fourth quarters are defined in terms of the start and end of the months sequence. If 
these definitions did not extend to March or October, or to the last month in each season, 
a delay of positive reaction might occur at these points. The facilitation of negative reactions 
to probes drawn from the season opposite to the target season is suggestive of involvement 
of a structure specifying relations of months to seasons and of seasons to one another (see 
Seymour, 1977). 


Response effects 


A main argument of this paper has been that the semantic coding of the months 
incorporates a number of positively and negatively polarized dimensions: 
(CHROMATIC ↔ ACHROMATIC), (HOT ↔ COLD), (GROW ↔ DIE). The finding that a Yes-No 
effect occurred for the central months but not for the end months in the 
temperature-growth experiment provided good support for this analysis. The results can be 
taken as a further demonstration of markedness effects on judgemental and response 
processes, paralleling findings for the (ABOVE ↔ BELOW) dimension of vertical location 
(Seymour, 1974a, 1974b, 1975) and for a (MEANINGFUL ↔ MEANINGLESS) dimension 
(Seymour & Jack, 1978). 

Various theories may be proposed as interpretations of the markedness effect. Seymour 
(1975) considered that positive and negative elements inherent in the coding of a display 
might induce reciprocal adjustments to the thresholds of ‘same’ and ‘different’ 
decision-makers (cf. Schaeffer & Wallace, 1970) or might prime the logogen units involved 
in production of Yes and No responses. An alternative possibility, favoured by Seymour & 
Jack (1978), is that the effect is a result of a central semantic conflict which occurs when a 
judgemental code (SAME ↔ DIFFERENT) becomes active at the same time as another 
positively or negatively polarized code. The effect can be viewed as a Stroop conflict (see 
Seymour, 1977) which occurs whenever the (SAME ↔ DIFFERENT) dimension and another 
polarized dimension display opposed values. Thus, the positivity of the central months 
would be expected to delay No responses and possibly to facilitate Yes responses, whereas 
the negativity of the end months would delay the Yes response and possibly facilitate the 
No response. Given a general bias in favour of the Yes response (see the location 
verification condition of Expt 3), this would tend to exaggerate the response effect for 
central months but to diminish it for the end months. 


Coding of location 


It was considered possible that the structural and semantic coding of the months might be 
dissociated, and that in consequence the month × response interaction might appear in 
altered or diminished form in structural judgement tasks. This expectation was not 
supported since centre-end verification produced an interaction that was very similar to 
that found in temperature-growth verification (Expt 3). This could be because the semantic 
dimensions were evoked in the centre-end discrimination task. However, this interpretation 
could not also apply to the half discrimination task (Expt 4) which showed a Yes-No effect 
for the first-half months, but not for the second-half months. 

These results could be accommodated by arguing that the locations of the months are 
defined in terms of spatial or temporal dimensions which also incorporate a positive-negative 
bipolarity. Candidates for such dimensions might be: (CENTRAL ↔ PERIPHERAL) 
and (EARLY ↔ LATE). If we assume that (PERIPHERAL) and (LATE) are the negative poles of 
these dimensions, the reduction in the size of the Yes-No effect associated with the end and 
second-half months could be explained by the coding conflict mechanism outlined above. 
Indeed, it would then become an open question whether the effects found in the 
temperature-growth experiment depended on the polarization of the semantic dimensions or 
of the locative dimension. 


Boundary effects 


The temperature-growth, centre-end and half verification experiments all demonstrated a 
peaking of the RT profile at the transitions between segments of the months series. This 
boundary effect occurred at the March-April and September-October junctures in Expts 2 
and 3, and at the June-July juncture in Expt 4. Although the months are formally cyclic, 
an effect was not noted at the December-January transition in the half verification 
experiment. 

The boundary effect is closely related to the symbolic distance effect which occurs when 
subjects make judgements about the relative sizes of objects or the magnitudes of numbers 
(Banks, 1977). It has sometimes been maintained that the distance effect is characteristic of 
operations on analogue representations of magnitude (Moyer & Landauer, 1967; Paivio, 
1975; Moyer & Bayer, 1976). This theory could be applied to the months by proposing 
that boundary regions and month locations are represented by quantities which are 
analogues of position on a continuous scale. A problem of discrimination might arise when 
similar values were compared, e.g. when a month position code was very similar to a 
boundary code. 

A main difficulty with an analogue coding theory of this kind is that the analogue 
representations appear semantically neutral. As Banks (1977) has pointed out, the analogue 
coding assumption does not predict the occurrence of semantic congruity effects (an 
interaction of item position and ‘choose larger’ versus ‘choose smaller’ instructions), 
although such effects are characteristic of judgements of digit magnitude (Banks et al., 
1976) and of object size (Banks & Flora, 1976). Similarly, an analogue coding model would 
not predict the interactions with Yes and No responses which were obtained in the present 
series of experiments. 

The alternative proposal of Banks (1977) is that items in series are assigned rather 
general codes, such as (LARGE) or (SMALL), and that distance effects arise when these codes 
prove inadequate for the discrimination of the stimuli. Thus, the choice of the larger of the 
numbers 7 and 9 is subject to delay because both are coded as (LARGE), and because 
additional information about the precise sequencing of the digits must be retrieved before 
the judgement can be made. An illustration of this process has recently been given by 
Hamilton & Sanford (1978). Their subjects judged whether pairs of letters were correctly or 
incorrectly ordered and then reported whether they had mentally run through any section 
of the alphabet before making their decision. The RTs showed a standard symbolic 
distance effect, but this was largely dependent on a rise in the probability of mental 
rehearsal of sequences of letters which occurred at reduced separations. 

An account of this kind could be applied to the months by assuming two levels of 
structural coding, one specifying general locative information and the other more precise 
information about relative positions of adjacent months. The general code, which might be 
called the locative code, specifies month position as a series of categories, probably 
including: (START), (END), (EARLY), (LATE), (CENTRAL), (PERIPHERAL). The second level of 
coding, the adjacent position code, specifies the ordered series of month names which map 
onto a particular locative category. The additional assumption, argued above, is that the 
locative categories are grouped on the bipolar dimensions of centrality (CENTRAL ↔ 
PERIPHERAL) and earliness (EARLY ↔ LATE) and that this is the basis of their semantic 
positivity and negativity. 

According to this theory, boundary effects occur because boundary locations, other than 
December-January, cannot be discriminated by a consideration of locative codes alone. 
The December-January juncture is an exception because these months are assigned (START) 
and (END) codes which indicate that they are first and last members of a cyclic system. In 
centre-end discrimination, the codes (CENTRAL) and (PERIPHERAL) are adequate for direct 
classification, but the codes (EARLY) or (LATE) are not, and retrieval of adjacent position 
codes for March-April and September-October produces a boundary effect. Similarly, in 
half discrimination, the codes (START), (END), (EARLY) or (LATE) are adequate but the code 
(CENTRAL) is not, and retrieval of the position code for June-July occurs and generates a 
central boundary effect. 

The order judgement task (Expt 4) can be taken as providing information about ease of 
retrieval of adjacent position codes at different points in the months series. The RT profile 
took the shape of an inverted W, indicating that access to adjacency information was 
slower for (EARLY) or (LATE) months than for (CENTRAL) months or the (START) and (END) 
months. This retrieval difficulty would be likely to exaggerate the boundary effects obtained 
at the March-April and September-October junctures. The boundary effect (defined as the 
difference in RT between the two central months of the segment and the two peripheral 
months) was about 260 ms in the centre-end discrimination task as against just over
100 ms in the half discrimination task (see Tables 5 and 7).

As was commented earlier, the proposed analysis of the locative codes makes it difficult
to distinguish semantic judgements from structural judgements. Temperature-growth 
decisions could be based on the locative coding structure, and the bipolarity of the 
dimensions could be responsible for the months × response interaction. However, an
account of this kind would not predict the months × dimensions interactions which were
obtained in Expt 2. Further, inspection of Table 4 will confirm that there were places where 
the boundary effect was modified, e.g. in Yes responses to growth queries about April. It 
remains a possibility, therefore, that temperature-growth decisions were based, where 
possible, on consideration of the (WARM ↔ COLD) and (GROW ↔ DIE) structures, but that
adjacent position codes were retrieved when these codes proved inadequate for
the discrimination, generating the observed boundary effects.
Phonological recoding and the Stroop effect


Peter Naish 





In a card-sorting version of the Stroop task it was demonstrated with male subjects that non-words,
which either looked like or were pronounced like real colour words, produced slower sorting by 
colour than did non-words which did not resemble colour words in any way. For female subjects this 
effect was observable only with the pseudo-homophones. The results are accounted for by proposing 
that males and females used different reading strategies during this task, but that the differences are 
less distinctive during more normal reading situations. 





There is much evidence to suggest that a reader has two possible techniques available to 
him for the analysis of the printed words he is scanning. The theories of access to word 
meaning associated with these reading modes, together with relevant experimental results, 
have been well reviewed (e.g. Bradshaw, 1975; Meyer et al., 1974). It would seem that, 
even in the absence of available experimental data, a developmental viewpoint would 
postulate two possible strategies of printed word analysis, since when a child reaches the 
age to start reading he can already do two things associated with language: he can 
understand and produce speech, and he can put names to visually presented items. Reading 
strategies based on the first skill would presumably be economical in their use of processing 
capacity, as existing auditory analysis pathways could be used, if preceded by a suitable 
‘shape-to-sound’ converter. However, as long as a child can learn to distinguish between 
and correctly name the many words he will have to recognize, then it may be unnecessary 
to set up any new form of reading-dedicated analyser at all; the child simply names words 
as he does any other items in his environment. The description of this second technique 
carries the implication that to recognize a word shape and find its name is also to become 
aware of its meaning, since the alternative would be merely to find the name, which would 
be internally monitored and understood as if it were an external speech signal. Such a
process would resemble very closely the first, in its use of existing auditory pathways. 

The two routes for reading outlined above, involving some form of translation to an 
internal representation of speech, or a direct visual recognition, will be referred to as the 
phonemic route and the visual route, respectively. The teaching of reading, for example 
phonetically, or by the ‘whole word’ technique, has followed fashions reflecting prevailing 
beliefs about the relative importance of these two pathways. Neither of these techniques can 
guarantee to induce reading by a particular route, but it may be presumed that teaching a 
child to associate sounds with letters could foster the use of phonemic processing and that 
encouragement to associate the word name with the complete string might lead to direct 
visual access. Although Rubenstein et al. (1971) suggested that the meaning of a word was 
reached solely by the phonemic route and Baron (1973) presented evidence supporting the 
claim that the reverse was true, it is now generally held that both routes are available. 
Nevertheless, there is a sense in which many researchers effectively support the concept of a 
single route to meaning, inasmuch as it is believed that one form of analysis is faster than 
the other and therefore is almost always the route which accesses the word's meaning. 
However, there does not seem to be general agreement as to which reading technique is in 
fact the quicker; for example Coltheart et al. (1977), by measuring the time required for 
subjects to decide whether a letter string was a word, decided that it was most probably the 
visual method, while Martin (1978), presenting data derived from measurements of the
Stroop effect, concluded that the phonological route was the most rapid. Her techniques
and conclusions lend themselves to further examination. 

The Stroop effect (Stroop, 1935) is most dramatically demonstrated by preparing a list of
colour names, printed in coloured inks, with the ink colour differing from the printed
colour-name for each word. When a subject is required to go through the list, naming the
ink colours as rapidly as possible, it is found that the conflicting textual and colour
information in each word slows the subject’s responses and leads to errors. The version of 
the test used by Martin (1978) consisted of writing each word on a separate card and 
required the subject to sort the cards into piles of the same ink colour. Although a verbal 
response was not required, this technique had previously been shown (e.g., Flowers & 
Stoup, 1977) to produce an effect, i.e. packs in which the words were colour names were 
sorted by ink colour more slowly than packs of irrelevant, non-colour words. Martin added 
to this finding by showing that if, during the sorting, the subject continually said ‘Bla, bla, 
bla...’ then the colour word packs were no longer sorted more slowly than the neutral 
word packs. She concluded that in the silent condition, as the subjects sorted the cards, the 
conflicting word information was accessed phonologically; speech prevented phonological 
encoding, hence the word meaning was not accessed and no conflict occurred, so 
permitting faster colour decisions. Thus, Martin sees the Stroop effect as being brought 
about by the rapid access of a word's meaning by the phonological route, visual encoding 
being presumed to be too slow to interfere with colour naming. If this is a correct 
description of the situation, then subjects should also be slow to sort a pack of cards 
containing non-words such as BLOO and WYTE, since phonological encoding would access 
the meanings of the real word homophones, giving rise to the usual response conflict,
before the ostensibly slower visual analysis could detect the inappropriateness of such a 
result. In the experiment to be reported it was decided to compare sorting times for packs 
printed with real colour words and those containing pseudo-homophones of the same
words. Clearly such non-words are very similar to their real counterparts, not only in 
pronunciation, but also visually. Thus, any Stroop effect detected for the second pack 
could reasonably be attributed to their visual similarity, if one were to assume that visual 
analysis might, after all, proceed sufficiently quickly to interfere with colour decisions. To 
control for such a possibility matching non-words of the form BLOD and WOTE would be 
required, which resemble physically the real words as closely as do the 
pseudo-homophones. If the two non-word types produced equal decrements in sorting 
speed, then analysis by the visual route could not be ruled out. 


Method 
Materials 


Five packs of cards were produced. In each case the cards were black and measured 12.5 × 7.5 cm;
they carried upper-case letters 1 cm high. In the first pack the letters spelled the colour names: BLUE, 
WHITE, YELLOW, GREEN and PURPLE. The pack contained 25 cards, each word appearing five times, 
printed once in each of the five colours, thus word and ink colour were congruent on five cards, once 
for each colour. The second pack was identical to the first, except that the words were replaced with 
the pseudo-homophones: BLOO, WYTE, YELLOE, GREAN and PERPLE. The shape control pack was again 
similar, but the pseudo-homophones were replaced with the non-words: BLOD, WOTE, YELLOT, GRELN
and PARPLE. Speed of sorting of the above three packs was to be judged against a fourth, also 
containing non-words, but similar to the former only in length and number of syllables. They were: 
FRON, MOBE, DEVORT, BLATE and STEGIN and again each appeared five times, once in each colour. A 
practice pack of 20 cards was also produced, in which the cards all contained the letters xxxx. Each 
colour was represented four times. Prior to use the cards in each pack were randomized by shuffling. 
Subjects were seated at a table, with five labels bearing the colour names placed in a semicircle in 
front of them. To familiarize them with the task they were handed the pack of xxxx cards and 
instructed to sort them by colour, as quickly as possible, into piles against the appropriate labels. 

Table 1. Mean times (s) for sorting of the four pack types

                               Pseudo-                          Means across
                    Words      homophones   Shape     Neutral   packs
Males               25.4       23.7         24.4      21.8      23.8
Females             23.1       22.6         21.2      21.4      22.1
Means across sex    24.2       23.1         22.8      21.6

Table 2. Mean percentages of time required to sort each pack type, when presented in the
four possible sequence positions

                                                                Mean across
                      First     Second    Third     Fourth      position
Males
  Words               28.5      28.3      25.8      24.3        26.7
  Pseudo-homophones   27.4      24.1      24.8      22.9        24.8
  Shape               27.7      25.3      24.2      24.7        25.4
  Neutral             23.9      24.2      22.3      21.6        23.0
  Means across packs  26.9      25.5      24.3      23.4
Females
  Words               28.6      25.9      25.2      24.9        26.1
  Pseudo-homophones   27.4      24.8      24.8      25.4        25.7
  Shape               25.4      24.7      23.4      22.5        24.0
  Neutral             24.7      23.7      24.1      24.4        24.2
  Means across packs  26.6      24.8      24.4      24.3

They were instructed that the remaining four packs were to be sorted similarly, although they would 
not contain xs. Subjects received these four packs to sort in an order randomized across subjects and
arranged so that a given pack appeared an equal number of times in the four possible positions, when
the data from all subjects were pooled. The sorting times were measured to 0.1 s, using a
stopwatch. 


Subjects 


The subjects were 24 males and 24 females, technical staff, undergraduates and graduates, at the 
Universities of Oxford and Reading. The order of presentation of packs was balanced separately 
within males and females. 
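
The balancing constraint described above (each pack appearing equally often in each of the four positions within a group of 24 subjects) can be illustrated with a short sketch. The cyclic Latin square used here is an assumption made for illustration; the report says only that the orders were randomized subject to the balance requirement, and the names PACKS and balanced_orders are hypothetical.

```python
from collections import Counter

# Pack labels follow the four critical packs named in the Materials section.
PACKS = ["words", "pseudo-homophones", "shape", "neutral"]

def balanced_orders(packs, n_subjects):
    """Return one presentation order per subject.

    Assumption: a cyclic Latin square is repeated until every subject has
    an order. This is one simple way to satisfy the balance constraint,
    not necessarily the scheme used in the study.
    """
    k = len(packs)
    if n_subjects % k != 0:
        raise ValueError("n_subjects must be a multiple of the number of packs")
    # Row r of the square starts with pack r and cycles through the rest.
    square = [[packs[(row + col) % k] for col in range(k)] for row in range(k)]
    return [square[i % k] for i in range(n_subjects)]

# 24 subjects per sex, as in the Subjects section.
orders = balanced_orders(PACKS, 24)

# Count how often each pack falls in each sequence position.
counts = Counter((pos, pack) for order in orders for pos, pack in enumerate(order))
```

With 24 subjects and four packs, every pack falls in every position six times, which matches the six subjects per cell mentioned in the Results.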


Results 


Table 1 shows the mean sorting times, averaged across subjects, for the four critical pack 
types. A three-way analysis of variance (pack × subjects × sex) showed that the slightly
faster overall sorting speed of female compared with male subjects was not significant.
However, the effect of pack type was significant (F = 8.223, d.f. = 3,138, P < 0.001) as
was the interaction between sex and pack types (F = 2.651, d.f. = 3,138, P < 0.05).
Analysed in this way, although practice effects on sorting speed were balanced across 
subjects, these effects nevertheless contributed considerably to the between-subject variance. 
The data were therefore re-analysed in a second three-way ANOVA, the data being 
organized as pack types × position in sequence × sex, six subjects contributing to each cell.
Each subject contributed data to only four cells, so, to eliminate the effects of different 
overall sorting speeds between subjects, each subject's data were expressed as a percentage 
of the total time that the subject spent in sorting the four critical packs of cards. Thus, if
the subject took equal lengths of time on each pack, the subject’s four scores would be
scaled to 25.0. Table 2 shows the mean percentage times derived from this scaling
procedure. The analysis of variance showed significant effects on sorting speed of position
in sequence (F = 16.563, d.f. = 3,160, P < 0.001) and type of pack (F = 14.116,
d.f. = 3,160, P < 0.001). The pack type × order interaction (F = 1.091, d.f. = 9,160) and
the three-way interaction, pack × order × sex (F = 0.843, d.f. = 9,160) were both
non-significant. The only significant interaction was that between sex and pack type
(F = 3.974, d.f. = 3,160, P < 0.01). These effects were further examined, using the
Newman-Keuls method for multiple comparisons. An examination of the order effect
showed that packs sorted first required longer than in any other position (P < 0.01); the
second position was slower than the fourth (P < 0.01), but none of the other sequence
differences was significant. Packs containing words were sorted more slowly than any other
pack type (P < 0.01) and pseudo-homophones and shape control packs were sorted more
slowly than the neutral control pack (P < 0.01 and P < 0.05 respectively). The interaction
data showed that males sorted the shape control pack significantly (P < 0.01) more slowly
than the neutral pack, but not significantly more quickly than the word pack. The
pseudo-homophone pack was sorted by males more slowly than the neutral pack
(P < 0.025) and more quickly than the word pack (P < 0.025), but not significantly
differently from the shape cards. Females differed from males in that they sorted the
shape control pack more quickly than the word pack (P < 0.01), but not at a significantly
different rate from the neutral pack, which was itself sorted more quickly than the real
words (P < 0.025). These subjects did not sort the pseudo-homophones significantly
more quickly than the real words, but the sorting of pseudo-homophones was somewhat
slower than the shape control pack (P < 0.07).
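
The percentage scaling used for the second analysis can be sketched as follows. The function name to_percentages and the sorting times below are hypothetical and are not data from the study; the sketch only shows how each subject's four times are re-expressed as shares of that subject's own total.

```python
def to_percentages(times):
    """Express each pack's sorting time as a percentage of the subject's total."""
    total = sum(times.values())
    return {pack: 100.0 * t / total for pack, t in times.items()}

# Hypothetical sorting times (s) for one subject; not taken from the paper.
subject = {"words": 26.0, "pseudo-homophones": 24.0, "shape": 25.0, "neutral": 25.0}
scaled = to_percentages(subject)

# A subject who spends the same time on every pack scores 25.0 on each,
# whatever the absolute sorting speed.
even = to_percentages({"words": 20.0, "pseudo-homophones": 20.0,
                       "shape": 20.0, "neutral": 20.0})
```

Because every subject's four scores sum to 100, differences in overall sorting speed between subjects no longer contribute to the variance, which is the stated purpose of the rescaling.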


Discussion 

The data derived from female subjects support the Martin (1978) claim that it is a letter 
string's phonological representation which gives rise to the Stroop effect. They show that a 
non-word homophonic with a colour word is as damaging to a subject trying to sort by 
colour as is the colour word. Conversely, a non-word equally like the real colour word in 
shape, but differing in pronunciation, does not delay sorting any more than a non-word 
which is not shaped or pronounced like a colour word. The fact that shape control packs 
did not slow female subjects suggests either that they were not accessing word meanings 
from their printed shapes, or that females are highly discriminating in their shape 
judgements and were able to reject the non-words for being insufficiently similar to the real 
colour words. The damaging effects upon female sorting speeds of the non-word 
homophones, which are equally discriminable from the real words in terms of shape, lends 
plausibility to the suggestion that their favoured mode of reading, at least in this colour
situation, was via the phonological route. 

Gough & Cosky (1977) refer briefly to a similar study, in which the task of the subjects
was to name colours aloud rather than sort cards. Their results also suggest that subjects 
are affected by pseudo-homophones, but not by shape controls. However, a different 
explanation from the above is clearly required for the behaviour of the male subjects in the 
present experiment. Both the pseudo-homophone and the shape control packs proved more 
distracting for these subjects than the neutral control pack. Since the pseudo-homophones 
did not produce the same level of Stroop effect as did the colour words with which they 
were homophonic it must be presumed that the males did not rely upon the results of 
phonological encoding during this task. These non-words did, however, slow response time 
with respect to the neutral non-words, as did the shape controls. The position of the 
pseudo-homophones and their similarly shaped counterparts, lying between the real colour 
words and the neutral non-words in their effects on sorting speeds, may be presumed to 
reflect the fact that their shapes too are intermediate between the neutral non-words and
the real colour words in their similarity to the real words. Males, it would seem, were 
accessing word meanings by a direct visual route. 

Differences in reading strategies between males and females, as implicated in these 
results, have been suggested before (e.g., Coltheart et al., 1975), but if males and females do
indeed generally use different analysis routes it is remarkable that clear and stable sex
differences do not emerge in all studies of the reading process; Jorm (1979) has failed to
replicate the Coltheart et al. (1975) findings. Clearly, sex differences of this type are not
robust and may well be influenced by factors such as the method used to teach the 
individual to read. Furthermore, there is some indication (Nyborg, in preparation) that 
spatial and verbal abilities, in both males and females, are influenced by hormone levels 
and that these levels are not strongly correlated with sex. There appears to be a certain 
task specificity in the manifestation of sex differences in language-related activities, and a 
speculative explanation will be put forward to account for their presence in these results. 
For the examination of reading processes a Stroop-based test is an oblique tool, since the 
subjects in fact actively avoid reading the confusing words. The version of the Stroop test 
in which colours are named aloud always demonstrates larger interference effects than do 
those in which silent card sorting, or button-pressing responses are required. A verbal 
response to a visual stimulus may resemble reading sufficiently closely to encourage the 
encoding of the interfering textual material. In the present study, in which, unlike that 
reported by Gough & Cosky (1977), silence is maintained, any textual information which is 
still assimilated during the judging of colours may follow channels other than those used 
when a subject is reading by choice. If a difference does exist, then it is likely that during 
inadvertent reading the analysis channels involved are the more basic in nature, while 
directed reading makes use of more complex techniques that have been acquired through 
practice and require conscious direction. There is some evidence (e.g. Perfetti & 
Hogaboam, 1975; Barron, 1978) that, in children, it is the skilled readers who have 
acquired a preference for the phonological route. Moreover, a number of studies have 
demonstrated that boys experience greater difficulty in learning to read than do girls (e.g. 
Rutter et al., 1976). Thus, if a situation exists in which adults revert to a more primitive 
form of word analysis, it is suggested that males are more likely to demonstrate the use of a
visual route and that the Stroop test described here is such a situation. This form of
explanation would also account for the results of the letter cancellation task reported by
Coltheart et al. (1975). Corcoran (1966) had previously reported that the pronunciation of 
a word affected the chances of a subject detecting the presence of a letter in the word. 
Coltheart et al. showed that the phonological effect was most marked in females; thus 
males and females were differentially affected by word pronunciation in a task which did 
not actually necessitate reading. The males may no longer have been using phonological 
encoding in this non-directed reading situation and, although the letter cancellation task 
was not designed to demonstrate an actual swing towards the use of a visual route, the 
males' greater accuracy in letter detection is consistent with the hypothesis that such a 
switch occurred. 


Conclusions 


The claim that during a Stroop task the conflicting verbal information is accessed via 
phonological recoding is supported in the case of female subjects by the fact that their 
responses were slowed as much by pseudo-homophones of colour words as by the words themselves.
The pseudo-homophones did not slow male subjects more than did other non-words of similar
shape but different pronunciation; so for these subjects the claim that they use a visual
encoding route is more plausible. The form of reading taking place during this task was 
probably not representative of normal, intentional reading and may have demonstrated a 
reversion to less sophisticated techniques. 
Recognition of word associates in semantic paralexia


Narinder Kapur 





This study examines the processing of word associates by a patient who makes semantic paralexic 
responses in oral reading. His ability to extract semantic features from printed words was assessed 
using a task which involved the recognition of specific associates of printed words. He could
recognize as many associate words as normal control patients, but was more likely to make a positive
response to stimulus words unrelated to the target item. There was also a tendency to more readily
accept same-category associates as being related to target words. It is suggested that semantic
paralexic patients sometimes make inadequate ‘checks’ on related words which are activated when
printed words are perceived, and possible retraining procedures are suggested which may improve
their checking strategies.





Many different reading disorders are found in patients with acquired dysphasia (Marshall 
& Newcombe, 1977). A certain small group of dysphasic patients, when they are asked to 
read aloud, are particularly prone to make word substitutions that are semantically similar 
to the target words. For example, MERRY may be read as ‘jolly’, CONSCIENCE may be read 
as ‘honesty’, etc. In addition, they show a number of other characteristic patterns of 
reading performance, including relatively good reading of concrete/high imagery words and 
rather poor reading of abstract nouns, function words and nonsense syllables (Coltheart, in 
press; Marshall & Newcombe, in press). Several terms have been used to classify such 
patients, including ‘deep dyslexia’ (Marshall & Newcombe, 1966), ‘paralexia’ (Benson & 
Geschwind, 1969) and ‘phonemic dyslexia’ (Shallice & Warrington, 1975). None of these 
terms appears to be quite appropriate. The first begs the question as to what is ‘deep’ 
about the condition. The second term is too general, since the word ‘paralexia’ can refer to 
general word substitutions in oral reading. The third term has since been considered 
inappropriate by its authors (Shallice & Warrington, in press). Accordingly, the term used 
in this paper is ‘semantic paralexia’, the condition being classified in terms of its chief 
distinctive feature; remaining aspects of the reading disorder have been noted in other 
aphasic patients (Gardner & Zurif, 1975; Beauvois & Derouesne, 1979). 

The presence of such associative errors in the reading performance of semantic paralexic 
patients suggests that they are able to process some of the semantic features of words which 
they cannot read aloud correctly. The extent of this processing is, however, unclear and it 
may include a defect in extracting semantic features of printed words. The aim of the 
present investigation was therefore to examine this processing more closely. We employed a 
test (Goodglass & Baker, 1976) where patients are required to recognize acoustically 
presented associates of printed target words. Different aspects of the semantic field can be 
tested by including associates which have particular relationships to target words, e.g. 
superordinate, attribute, etc. 


Method 
Patients 


PD, the semantic paralexic patient under investigation, was born in 1943 and suffered a 
left-hemisphere cerebro-vascular accident in 1969. His clinical condition, which is mainly 
characterized by a right hemiparesis and a Broca-type aphasia, is described more fully in an earlier 
report (Kapur & Perl, 1978). A recent CT scan (1978) indicated a large low-density area in the 
distribution of the left middle cerebral artery with some adjacent ventricular and sulcal dilatation. In 
addition to PD, four control patients were tested. These were out-patients receiving physiotherapy for 
a non-neurological condition, and they were matched with PD for age (mean = 35 years) and
educational level. 


Stimuli 
Test materials were the same as those used by Goodglass & Baker (1976), except where otherwise 
stated. Fifteen target nouns were selected and for each of these 14 associates had been drawn up, 
seven of which bore a systematic relationship to the target word and seven of which were unrelated to 
the target word. The 15 target nouns were: orange, easel, glove, crowbar, bottle, desk, accordion, 
sheep, knife, flask, cake, drum, garter, cactus and ostrich (the word ‘awning’ in the original list was 
not included here). Taking the word ‘ostrich’ as an example, the seven corresponding associates 
were: 
OSTRICH
superordinate: bird
attribute: feathered
same category member: peacock
functionally related verb: hide
context noun: zoo
alphabetically similar word: thirst
identity: ostrich

In the present investigation, alphabetically similar words were introduced in view of the visual 
confusions sometimes evident in semantic paralexic patients' reading. This category in fact yielded 
very few positive responses from PD or any of the control patients, and so it will not be considered 
further in the paper. 


Procedure 

Target words were presented in upper-case letters on separate cards. Each target word was placed in 
front of the patient. He then heard the 14 stimulus words (in a predetermined, semi-random order) 
and was asked to indicate after each word whether it reminded him of the target word in any way. 
He said ‘Yes’ if he thought the stimulus word was associated with the target word and ‘No’ if the 
stimulus word ‘did not go’ with the target word. Both PD and control patients made most responses 
almost immediately after presentation of the stimulus word, and so it was unnecessary to apply any 
strict latency criterion to responses. The target word was left in view of the patient until he had heard 
and responded to all 14 stimulus words (on a subsequent test session, PD successfully named all the 
target nouns, although for the word garter he first gave the response ‘stockings’). No feedback was 
given after responses. 


Results 
Responses to associate words 


PD was able to identify correctly almost all of the associated words. He responded perfectly 
in the case of the superordinate and identity words, and made only one error (maximum 
possible = 15) in each of the remaining categories. Control patients scored at a similar
level to PD in the case of most associate words, but failed to recognize a mean
of 3.75 same-category associates over the 15 target nouns (see Fig. 1).


Responses to filler words 


PD made an unusually large number of false positive responses to filler words. Over the 
whole test, he made 16 such responses to nine target words in all. The control patients, 
however, made a mean of 3.5 false alarm responses to a mean of 3.75 words. There was
relatively little overlap between the actual filler words selected by PD and by control 
patients. 

A signal detection analysis of responses to same-category associates and to filler words 
indicated similar d’ values for PD and for control patients (d’ = 2.51 and 2.56 respectively).
PD displayed a lower criterion than controls, as evident in a higher P_N(A) value (P = 0.15
and 0.03 respectively).
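
A minimal sketch of the kind of signal detection calculation referred to above is given here. It assumes the standard equal-variance Gaussian model, with hit rates taken from the same-category associates (15 per test) and false-alarm rates from the filler words, of which there are assumed to be 105 (seven unrelated words for each of 15 targets). Whether the original analysis derived its rates in exactly this way is not stated, so the figures produced need not match the published values to the second decimal place; d_prime and criterion are illustrative names.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def d_prime(hit_rate, fa_rate):
    """Sensitivity under the equal-variance Gaussian model: z(H) - z(F)."""
    return _z(hit_rate) - _z(fa_rate)

def criterion(hit_rate, fa_rate):
    """Response criterion c = -(z(H) + z(F)) / 2; lower means more willing to say Yes."""
    return -0.5 * (_z(hit_rate) + _z(fa_rate))

N_TARGETS = 15
N_FILLERS = 7 * N_TARGETS  # assumed: seven unrelated words per target

# PD: one same-category miss, 16 false positives (from the Results).
pd_hit, pd_fa = 14 / N_TARGETS, 16 / N_FILLERS
# Controls: mean of 3.75 same-category misses, mean of 3.5 false positives.
ctl_hit, ctl_fa = (N_TARGETS - 3.75) / N_TARGETS, 3.5 / N_FILLERS

pd_d, ctl_d = d_prime(pd_hit, pd_fa), d_prime(ctl_hit, ctl_fa)
pd_c, ctl_c = criterion(pd_hit, pd_fa), criterion(ctl_hit, ctl_fa)
```

Under these assumptions both d’ values come out close to 2.5, while PD's criterion is lower than that of the controls, in line with the pattern reported: similar sensitivity combined with a greater readiness to respond ‘Yes’.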
Figure 1. PD’s responses to different types of associate words: percentage errors (false negative
responses) for superordinate, attribute, same-category member, functionally related verb, context
noun and identity associates. □, control patients; ■, PD.

Discussion 

Our semantic paralexic patient could recognize most of the associates of a series of printed 
concrete nouns. In addition, he showed an abnormally strong tendency to accept neutral 
words as being associated with the printed target nouns. 

The notable feature of PD’s performance was his excellent recognition of semantic 
associates of printed words. This level of performance was higher than most of the 
dysphasic patients studied by Goodglass & Baker (1976), control patients in the present 
study performing at a similar level to their controls. Also, in contrast to their dysphasic 
patients and to the control patients used here, PD displayed a greater tendency to accept 
neutral words as being associated to target nouns. His behaviour suggested that he was not 
carrying out ‘checks’ on words to see if their features sufficiently overlapped those 
activated by perception of the printed target word. In the case of a few false positive 
responses, it is possible to note some indirect association between target and stimulus 
words, e.g. ‘actor’ in the case of the target word DRUM, ‘vitamins’ in the case of the target 
word SHEEP. However, in the case of other items, there does not appear to be any such 
relationship between words, e.g. ‘machine’ for ORANGE, ‘taste’ for GLOVE. There is no 
evidence that PD misunderstood the test, nor that he was responding haphazardly, as his 
false positive responses were specific to nine out of the 15 target nouns. His somewhat low
threshold in accepting other words as associates also resulted in PD actually performing 
better than control patients in the case of responses to same-category associates. It is 
difficult to interpret this finding in view of PD’s apparent low response criterion, although 
the fact that PD correctly recognized same-category associates to five target nouns to which
he did not make a false alarm response suggests that the two processes were operating 
independently in PD's test performance. It is notable that a sizeable proportion of semantic 
paralexic patients' word substitutions are same-category associates to target nouns, and the 
present results would suggest that paralexic patients are for some reason more sensitive to 
these aspects of printed words. 

PD's performance in the present study suggests that paralexic responses are uttered 
because inadequate checks are made on the associate words that are activated when the 
printed target word is perceived. It would appear that in semantic paralexic patients' 
reading performance three main types of checks are relevant: semantic, grammatical and 
visual. Thus, visually (but neither grammatically nor semantically) similar words may be 
output when there is an inadequate check between the semantic and grammatical features 
of a possible response and those of the target word. Similarly, a word which is
semantically, but not visually or grammatically, similar to the target word may be output 
because the latter two types of checks were faulty. 

The basis of a possible retraining programme could therefore be to make a semantic 
paralexic patient more proficient at carrying out such checks, these being assumed 
normally to operate at a subconscious level of awareness but to be amenable to some 
degree of conscious control. The following, by no means exhaustive, training procedures 
are suggested for possible use with semantic paralexic patients. 


Semantic check 


A semantic paralexic patient could be encouraged to make fine semantic judgements by 
having to indicate the odd-word-out from triads such as 'repair-broken-mend' where there 
is a variable degree of semantic association between pairs of words in the triad. The patient 
could also be given practice in distinguishing between words which were visually and 
grammatically similar but semantically different. Thus he could be asked to make same- 
different judgements as to the meaning of word pairs such as ‘team-meat’, ‘pest-step’, etc. 


Grammatical check 


One would initially have to train the semantic paralexic patient to appreciate the various 
grammatical forms, explaining the different functions of nouns, verbs, etc. It might then be 
possible to provide explicit training whereby the patient indicated the odd-word-out from 
triads such as ‘climb-write-house’, ‘but-her-and’, etc. It may also be useful to ask for 
same-different judgements, in respect of grammatical function, for word pairs which 
included ones which were grammatically different but semantically and visually related: e.g. 
‘tea-eat’. 


Visual check 


Semantic paralexic patients often give word substitutions that are syntactic derivatives of 
target words. These derivatives are usually both semantically and visually similar to the 
original word, but the number of letters seldom corresponds exactly between the target and 
the offered response. One way in which the semantic paralexic patient could be made more 
aware of the letter-length difference between his response and the printed word is for him 
to indicate the number of letters in a word spoken by the examiner. This could be done 
using a matching task where the number of letters was represented by, for example, the 
number of dashes in a horizontal array. Thus, for the word ‘table’ there could be arrays of 
three, seven and five dashes. The whole procedure could be incorporated easily within a 
‘shaping’ paradigm such that task difficulty would be hierarchically structured. 


Time of day and retrieval from long-term memory 


Keith Millar, Brian C. Styles and David G. Wastell 





Whilst research has been concentrated upon the influence of time of day upon short-term memory, 
little account has been taken of its influence solely upon long-term retrieval. Here, the concern is with 
access to memorized information whose initial learning has occurred long prior to, and is 
independent of, the immediate experimental setting. Three separate groups (n = 18) performed a 
semantic classification task at 09.00, 14.00 or 18.00 h. The difficulty of retrieval was varied by 
requiring classification of words having ‘high’, ‘medium’ or ‘low dominance’ in given semantic 
contexts. The efficiency of retrieval (defined as a decreasing difference in latency between high- and 
low-dominance classification speed) was found to be greater for the group who performed at 18.00 h. 
As physiological arousal is supposed to increase through the day, this result may reflect a beneficial 
effect of the higher-aroused evening state upon retrieval efficiency. The result is directly opposite to 
the impairment of short-term memory performance which occurs as the day progresses. Some 
implications are drawn for other research. 








This paper examines the potential for the circadian variation in physiological arousal to 
influence retrieval from long-term memory. Previous studies of ‘time of day’ and memory 
have been concerned almost exclusively with effects upon initial learning and the 
consequences for subsequent recall: few studies have considered the potentially important, 
indeed potentially confounding influence of arousal solely upon the process of retrieval. 

Prior research has shown that the efficiency of both immediate and long-term memory 
performance varies as a function of the time of day at which learning occurs. Typically, 
learning which occurs in the morning is associated with superior immediate recall when 
compared to that following learning in the afternoon or evening (Blake, 1967; Baddeley et 
al., 1970; Folkard et al., 1977). In contrast, long-term recall shows an advantage for 
material which was originally learned in the afternoon when compared to that originally 
learned in the morning (Hockey et al., 1972; Folkard et al., 1977). 

Such results have been proposed to reflect an influence upon learning of the circadian 
fluctuation in the individual's level of chronic arousal. Arousal is supposed to rise from a 
relatively low level in the early morning to reach a peak at around 19.00 to 21.00 h, with a 
slight dip in the early afternoon (see Colquhoun, 1971; and cf. Akerstadt, 1977; Cohen & 
Muehl, 1977). As there is evidence that states of higher arousal during learning may impair 
short-term, but benefit long-term retrieval (see Levonian, 1972) it has seemed plausible to 
infer that the described changes in memory efficiency through the day may be mediated by 
the diurnal change in arousal. However, empirical findings do not provide such a clear-cut 
picture. Folkard & Monk (1980) point out that the effects of high arousal upon immediate 
recall are not consistently adverse and, indeed, there is evidence that high arousal may 
benefit short-term recall (see Eysenck, 1977). 

The inconsistent influence of high arousal upon immediate recall is not the only difficulty 
to afflict the attempt to relate changes in memory efficiency as a function of time of day to 
the fluctuating state of arousal. The results of previous studies have tended to be 
interpreted solely in terms of circadian arousal effects upon the initial learning process. This 
interpretation neglects a potential confounding with a circadian arousal influence upon 
retrieval; the state of learning can, after all, only be inferred from recall performance. In 
short-term recall paradigms, retrieval follows closely upon learning and, presumably, the 
circadian state of arousal must be largely the same during both processes. Potentially, 
therefore, the changes in immediate memory performance through the day may be due as 
much to the influence of arousal upon retrieval as upon learning or, indeed, a subtle 
interaction between the two processes. 

The potential influence of arousal upon retrieval is not simple conjecture, for both 
Eysenck (1975) and Millar (1979) have shown that loud noise (an arousing 
stimulus; Davies, 1968) alters the speed of retrieval from semantic memory when 
compared to control conditions. Eysenck found that continuous 80 dB white noise 
facilitated the speed of recall of high-dominance words (i.e. highly familiar or probable 
words within a given semantic context) for chronically low-aroused subjects, but inhibited 
the recall speed of low-dominance material in highly-aroused individuals. Millar did not 
account for individual variation in chronic arousal and, measuring semantic classification 
latencies, found faster high-dominance word recognition in 95 dBA noise than in 70 dBA 
control conditions. It is important to note that both results were unconfounded with the 
potential influence of noise upon initial learning and that treatment with noise interacted 
with the ‘dominance’ or, more loosely, the ‘depth’ or ‘level’ of the material to be retrieved 
from the semantic hierarchy. In other words, noise did not exert a gross effect upon 
retrieval but rather its influence was subtly determined by the nature of the material to be 
accessed in memory. 

In both of the studies above, the assumption that the arousing nature of the noise was 
instrumental in affecting retrieval does neglect the possibility that other distracting or 
aversive qualities of the stimulus may also have contributed to the particular pattern of the 
results. One must therefore exercise some caution when recruiting evidence from the noise 
studies in order to bolster the argument that diurnal fluctuations in arousal level may also 
produce changes in retrieval performance. However, this does not detract from the 
importance of examining the influence of time of day upon retrieval, particularly in view of 
the potential confounding in previous studies. 

Folkard et al. (1977) have already subjected the potential problem of confounding to 
experimental examination. Their study appears to have been provoked by Hockey's 
suggestion (personal communication cited by Folkard et al., 1977, p. 46) that such 
confounding may obscure an earlier study of the time of day and memory phenomenon 
conducted by Hockey et al. (1972). Folkard et al. examined the potential influence of 
circadian arousal upon retrieval by means of a recognition memory task, namely a 
multiple-choice questionnaire administered after a short story heard at 09.00 or 15.00 h. 
Where immediate retrieval was required, correct recognition was reliably better for subjects 
who had heard the story at 09.00 rather than at 15.00 h. Long-term retrieval (1 week later) 
was, however, superior for those who had first heard the story at 15.00 h. This pattern of 
results conformed to the influence of time of day upon memory as demonstrated in 
previous studies. However, long-term retrieval did not vary reliably as a function of the 
time of day at which it occurred. Folkard et al. therefore concluded reasonably that 
retrieval was not influenced by time of day — but with two important qualifications. First, 
they suggested that the difference in physiological arousal state between 09.00 and 15.00 h 
might be too small to influence retrieval. Secondly, they proposed that a measure of the 
simple probability of correct recognition might be insensitive to changes in retrieval 
efficiency. (Note, however, that retrieval is implicit in correct recognition which indicates 
that the appropriate stimulus representation has been accessed or retrieved from 
memory — see Glucksberg, 1962). 

The conclusions of Folkard et al. seem aptly reserved. Given these reservations and other 
evidence that retrieval may be influenced by noise-induced changes in arousal it would 
seem important to re-examine retrieval as a function of time of day. As a first step, the 
sensitivity of the retrieval measure might be sharpened by considering retrieval latency 
rather than probability: both Eysenck and Millar showed an influence of noise upon the 
speed of semantic retrieval but, as with Folkard et al., neither reported an effect upon 
retrieval probability or error-making. 

Secondly, retrieval can be made from semantic memory. This provides a convenient 
long-term store which has been accumulated over a lifetime and avoids the necessity for 
subjects to learn material for retrieval within the immediate experimental setting. Simply, 
semantically related words are considered to be stored in common categories within which 
the member words differ in their probability of retrieval as category instances. Those words 
which have a high probability of retrieval in a given category context are arbitrarily 
denoted as being of ‘high dominance’ for that category. Typically, as the category 
dominance of a word declines so the time taken to recognize or recall it increases (e.g. 
Wilkins, 1971). 

Thirdly, analysis of performance can be extended to more extreme times of day in order 
to span a wider range of arousal than that sampled by Folkard et al., within the limits of 
what can reasonably be demanded from a volunteer, civilian subject panel. 

Whilst it may not be valid to propose that the effects of diurnal fluctuations in arousal 
should parallel those observed in the noise studies of Eysenck (1975) and Millar (1979), a 
broad hypothesis can be formed on the basis of evidence that moderate increases in arousal 
often benefit performance efficiency (see Poulton, 1976). If arousal increases through the 
day then its beneficial effects should be most evident in the evening; retrieval might be 
overall faster or, more crucially, the difference in time taken to retrieve low-, relative to 
high-dominance material may be smaller than at other times of day. Indeed, further 
consideration shows that an interaction of time of day with the factor of dominance 
is critical, for a simple gross difference in retrieval latency across the day could say nothing 
about retrieval alone. As simple reaction speed tends to become faster through the day 
(Blake, 1967), overall faster retrieval at one particular time might simply derive from a 
change in basal response latency. Thus the present concern is with changes in the profiles 
of the latency functions which relate retrieval speed to semantic dominance. 


Method 
Subjects 


Fifty-four female members of the MRC Applied Psychology Unit subject panel were paid for their 
participation (age range 25-55 years; mean 38.5 years). Prior to experimentation, subjects were told 
only that the task was concerned with memory retrieval and no mention was made of possible time of 
day effects on performance. Subsequently, the rationale for the experiment was fully explained. 


The task 


This was one of semantic classification (after Wilkins, 1971). A category name (e.g. ‘animal’) was 
presented on each trial and followed by a test word (e.g. ‘horse?’). Subjects responded positively by 
saying ‘yes’ if the test word was a member of the category, otherwise by responding ‘no’ (e.g. 
‘furniture’: ‘apple?’). Response latencies and errors were recorded. 


Stimulus materials 


Thirty categories were drawn from Battig & Montague's (1969) category norms along with three each 
of high-, medium- and low-dominance, single-word instances of each category. For high-, medium- 
and low-dominance words respectively, average dominance levels were 1.2, 20.92 and 33.0 (Battig & 
Montague, 1969), average word-frequency counts were 93.28, 13.39 and 11.61 per million (Kucera & 
Francis, 1967) and average word lengths were 4.9, 5.0 and 5.5 letters. Negative instances (i.e. 
non-category members) were employed, of course, to prevent subjects from simply responding ‘yes’ 
on each trial. Being non-category members the dominance of negative instances could not be defined 
but their average word frequency and word lengths were 44.19 per million and 5.2 letters respectively. 


Stimulus presentation and response recording 

The category names and test words were presented in upper-case form on a Tektronix Visual Display 
Unit. A microprocessor (Motorola MP6800) stored the categories and test words and controlled the 
timing of trial events. The task was subject-paced; pressing the space bar of the Tektronix 
keyboard initiated the following events on each trial: (1) a 600 ms delay followed by display of 
the category name, 1500 ms before (2) the presentation of the test word alongside the category name; 
(3) both the category and test word remained visible until the subject had responded and pressed the 
space bar for the next trial. Presentation of the test word triggered an electronic timer which was 
stopped by the subject's vocal response via a boom microphone and voice switch. 
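
The sequence of trial events described above can be laid out as a simple timeline. The following is only an illustrative sketch: the function name, the event labels and the example vocal latency are hypothetical, and only the 600 ms and 1500 ms intervals come from the text.

```python
# Illustrative timeline of one trial; event labels are hypothetical.
CATEGORY_DELAY_MS = 600    # delay between space-bar press and category name
CATEGORY_LEAD_MS = 1500    # category name shown this long before the test word


def trial_timeline(vocal_latency_ms):
    """Return (time_ms, event) pairs measured from the space-bar press."""
    category_on = CATEGORY_DELAY_MS
    test_word_on = category_on + CATEGORY_LEAD_MS  # electronic timer starts
    response_at = test_word_on + vocal_latency_ms  # voice switch stops timer
    return [
        (0, "space bar pressed"),
        (category_on, "category name displayed"),
        (test_word_on, "test word displayed, timer started"),
        (response_at, "vocal response, timer stopped"),
    ]


# Example with an arbitrary (made-up) vocal latency of 700 ms.
for time_ms, event in trial_timeline(vocal_latency_ms=700):
    print(f"{time_ms:5d} ms  {event}")
```

The recorded latency is the interval between the last two events, so it excludes the fixed 2100 ms that precedes the test word.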


Design and procedure 

Subjects were allocated to one of three equal-sized, age-balanced, separate groups who performed the 
semantic memory task at one of three times of day. The ‘morning’ group performed at 09.15 or 
10.00 h, the ‘afternoon’ group at 14.15 or 15.00 h, and the ‘evening’ group at 18.00 or 19.00 h. 
Between-group presentation of the time of day factor precludes asymmetrical transfer effects which 
might otherwise arise from a repeated-measures design (Poulton & Freeman, 1966). 


Procedure 

Subjects were tested individually in an experimental cubicle under subdued lighting conditions to 
maximize readability of the Tektronix display. There was a total of 180 experimental trials, a random 
half of which paired category-name presentation with a positive test instance, the remainder with a 
negative instance. Within positive test trials, high-, medium-, and low-dominance test instances were 
randomly equi-probable. Thirty practice trials familiarized the subjects with the simple task routine 
and requirements. Subjects were instructed to give their decision responses as quickly as possible, 
consonant with maintaining accuracy, and, while self-paced, subjects were quite uniform in working 
through the trials in some 15 min. 
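
A trial sequence of the kind described could be generated along the following lines. This is a minimal sketch, not the authors' program: the function name and labels are hypothetical, and the exact 30/30/30 split of positive trials is an assumption, since the text says only that the three dominance levels were ‘randomly equi-probable’.

```python
import random

N_TRIALS = 180
DOMINANCE_LEVELS = ("high", "medium", "low")


def make_trial_types(seed=None):
    """Return a shuffled list of 180 (response_class, dominance) labels.

    Half the trials are positive and are split evenly over the three
    dominance levels (an assumed exact split); the other half are
    negative, for which dominance is undefined (None).
    """
    rng = random.Random(seed)
    n_positive = N_TRIALS // 2
    per_level = n_positive // len(DOMINANCE_LEVELS)
    trials = [
        ("positive", level)
        for level in DOMINANCE_LEVELS
        for _ in range(per_level)
    ]
    trials += [("negative", None)] * (N_TRIALS - n_positive)
    rng.shuffle(trials)  # random order of presentation
    return trials


trials = make_trial_types(seed=1)
print(len(trials), trials[:3])
```

Fixing the seed makes a given sequence reproducible; whether each subject received a different random order is not stated in the text.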


Results 


Figure 1 presents the group means of each individual's median correct positive and negative 
response latencies, the positive latencies being shown as a function of the three 
word-dominance levels. The summary of the analysis of variance applied to the positive 
latency data is shown in Table 1. 

An initial examination of Fig. 1 reveals that at each time of day the latency of 
classification increases as a function of decreasing word dominance. In other words, the 
lower the dominance of a word in a given semantic hierarchy the longer the time required 
to retrieve the identity of the word and confirm its category membership. This reliable main 
effect of dominance (P < 0.001) confirms previous findings in semantic research (e.g. 
Freedman & Loftus, 1971; Wilkins, 1971). 

A linear trend test indicates that the relationship between classification latency and word 
dominance is well expressed by the linear trend coefficients 1 0 -1 (F = 293.5, d.f. = 1, 
102, P < 0.001) which account for 97 per cent of the variation due to dominance. An 
important caveat here is that this linearity should not be taken to imply an underlying 
linear relationship between dominance and classification latency at the interval scale level; 
rather the linear contrast is introduced simply to summarize the relationship between 
dominance and latency in the present data. It indicates that the effect of dominance can be 
quantified as the overall latency difference between high- and low-dominance classification 
which, in turn, is equivalent to the gradient of the latency function. This point is important 
when considering the between-group differences in recognition latency which are described 
below. 
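
The 97 per cent figure can be checked against the entries of Table 1. The sketch below assumes that the linear trend was tested against the ‘B x subjects within groups’ error term, which the reported d.f. of 1, 102 suggests; the variable names are illustrative only.

```python
# Values taken from Table 1 and from the linear trend test in the text.
SS_DOMINANCE = 715248.00   # sum of squares for B (dominance)
MS_ERROR = 2364.08         # B x subjects within groups mean square
F_LINEAR = 293.5           # reported F for the linear trend (d.f. = 1, 102)

# A single-d.f. contrast has a mean square equal to its sum of squares,
# so the contrast sum of squares is F multiplied by the error mean square.
ss_linear = F_LINEAR * MS_ERROR
share = ss_linear / SS_DOMINANCE

print(f"SS(linear) = {ss_linear:.0f}")
print(f"share of dominance variation = {share:.1%}")
```

On these figures the linear component accounts for roughly 97 per cent of the dominance sum of squares, in line with the value quoted in the text.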

Figure 1. Semantic classification latencies of high-, medium- and low-dominance category instances as 
a function of retrieval during morning, afternoon and evening test sessions. Open symbols show 
latencies to negative (non-category) instances. 


Table 1. Summary table of the one-between (time of day), one-within (word-dominance) 
analysis of variance applied to the data of Fig. 1 

Source                        d.f.          SS          MS        F         P 
Between-subjects                53  2911904.00 
  A (time of day)                2   248912.00   124456.00     2.38      n.s. 
  Subjects within groups        51  2662992.00    52215.53 
Within-subjects                108   982608.00 
  B (dominance)                  2   715248.00   357624.00   151.27   < 0.001 
  AB                             4    26224.00     6556.00     2.77    < 0.05 
  B x subjects within groups   102   241136.00     2364.08 


Closer examination of Fig. 1 shows that the average latency of classification varies as a 
function of time of day; the morning group are overall slower and the afternoon group 
overall faster, but the main effect of time of day is not reliable (P > 0.1). However, the 
latency profiles clearly differ between the groups as a function of the dominance of the 
material to be retrieved (time of day x dominance interaction, P < 0.05). 

A first inspection of this interaction indicates that the effect of word dominance upon 
latency (reflected in the gradients of the latency functions) is smaller during the evening 
(low- minus high-dominance latency difference = 131 ms) than during morning (181 ms) 
and afternoon (168 ms) sessions. A comparison of the evening group's low-high 
dominance latency difference with those of the afternoon and morning groups is reliable 
(F = 4.97, d.f. = 1, 102, P < 0.05) and accounts for 48 per cent of the interaction variance. 
The comparison follows an approach described by Keppel (1973, pp. 239-243) and involves 
application of a matrix of coefficients generated by the vector product of the contrast vectors 
(½ ½ -1) for the time of day factor (C_A) and (1 0 -1) for the dominance factor (C_B). The 
matrix then embodies the hypothesis that the gradient of the latency function, C_B, is 
smaller in the evening than in either the morning or afternoon, C_A. The residual 
interaction variance fails to reach significance (F = 2.04, d.f. = 3, 102, P > 0.1) thus 
rendering unnecessary any further analytic comparisons and implying that afternoon and 
morning sessions do not differ reliably. It is perhaps useful to note that the single-contrast 
analysis above circumvents the increase in Type I error rate which may result from a 
multiple-comparison approach and that the present contrast, being orthogonal to both 
main effects, analyses only the interaction variance. 
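
The interaction contrast can be illustrated numerically. The sketch below is not the authors' computation: it assumes the time of day contrast is scaled as (0.5, 0.5, -1) (any scaling yields the same F), it works from the rounded latency differences quoted in the text rather than from the raw cell means, and all variable names are hypothetical.

```python
# Contrast vectors; the scaling of C_A is an assumption (F is unaffected).
C_A = (0.5, 0.5, -1.0)   # time of day: morning, afternoon, evening
C_B = (1.0, 0.0, -1.0)   # dominance: high, medium, low

# The outer product gives one coefficient per time-of-day x dominance cell.
matrix = [[a * b for b in C_B] for a in C_A]

# Low-minus-high latency differences (ms) quoted for each group.
gradients = (181.0, 168.0, 131.0)
N_PER_GROUP = 18
MS_ERROR = 2364.08         # B x subjects within groups, Table 1
SS_INTERACTION = 26224.00  # AB, Table 1
F_REPORTED = 4.97          # F reported in the text for this contrast

# Applying the matrix to the cell means reduces, up to sign, to a
# weighted sum of the three group gradients.
psi = sum(a * g for a, g in zip(C_A, gradients))
sum_sq_coeffs = sum(c * c for row in matrix for c in row)
ss_contrast = N_PER_GROUP * psi ** 2 / sum_sq_coeffs
f_contrast = ss_contrast / MS_ERROR

# Residual interaction on the remaining 3 d.f., using the reported F.
ss_residual = SS_INTERACTION - F_REPORTED * MS_ERROR
f_residual = (ss_residual / 3) / MS_ERROR

print(f"psi = {psi:.1f} ms, F(contrast) = {f_contrast:.2f}")
print(f"F(residual) = {f_residual:.2f}")
```

Every row and column of the coefficient matrix sums to zero, which is what makes the contrast orthogonal to both main effects. Computed from the rounded differences, the contrast F comes out a little below the reported 4.97, as would be expected if the published differences were rounded; the residual F derived from the reported value agrees with the 2.04 given in the text.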

The important interaction above can thus be summarized as follows: the gradients of the 
latency functions indicate that the influence of retrieval dominance upon classification 
latency is smaller in the evening than during the morning or afternoon sessions. In other 
words, in the supposed higher-aroused evening state the latency differential in retrieving 
high- as distinct from low-dominance semantic material is reliably reduced relative to 
performance at other times of day. 

For completeness, negative response latencies are shown by open symbols in Fig. 1. 
Analysis reveals that the groups do not differ reliably on this measure (main effect of time 
of day: F < 1.0). 

Errors, i.e. misclassifications of test words, account for an average of only 3.9 per cent of 
all trials. Error-making does not vary reliably as a function of time of day; neither the main 
effect nor interaction of that factor with word dominance or response class is significant 
(both F < 1.0). 

Discussion 

The results indicate that retrieval latency in a long-term semantic classification task does 
vary as a function of the time of day at which retrieval occurs. This effect can be related to 
the diurnal change in arousal only by inference, for no empirical measure of the arousal 
state of present subjects was taken. However, given that the inference is plausible then the 
results seem quite consistent with prior evidence that moderate increases in arousal can 
benefit performance efficiency (Poulton, 1976). In the present case, the supposed increase in 
arousal through the day is associated with an improvement in ‘retrieval efficiency’; 
increasing efficiency being described here as a decreasing difference in the time taken to 
access high- and low-dominance material. 

It must be acknowledged that the present definition of retrieval efficiency may present a 
simple view of the complex and uncertain processes involved in the semantic classification 
task and in the production of the dominance effect. The task also requires decision and, 
arguably, the factor of time of day may interact with the latter component rather than with 
retrieval. For instance, low arousal might cause slowing in decision speed, particularly on 
less familiar, low-dominance words. 

However, three points may cast doubt upon the possibility that the present results are 
due purely to arousal influence upon decision. First, the medium- and low-dominance test 
words were not chosen to be ambiguous category members which might form the basis of a 
difficult decision (e.g. words lying on category boundaries). All were relatively well known 
but, of course, relatively infrequently occurring examples of the categories in question. 

Secondly, the morning and afternoon groups did not make proportionately more errors 
than the evening group when classifying the less dominant words. Arguably, if arousal did 
affect semantic decision-making then an influence upon accuracy might occur. For 
instance, sleep deprivation has been associated with increased decision errors on the simple, 
if repetitive, serial-choice task (e.g. Wilkinson, 1963; but cf. Corcoran, 1962). 

Thirdly, the groups did not differ reliably in their negative decision latencies. Negative 
instances are by definition ultra low-dominance for any category with which they are 
paired: if the low-dominance decision took longer in the low-aroused state then one might 
have anticipated reliably longer negative latencies by the morning and afternoon groups. 
As it is, the non-significant differences in latency between the groups can probably be ascribed 
to slight changes in basal reaction time through the day (e.g. Blake, 1967). 

These points may encourage confidence in the proposal that the present results do 
reflect an influence of time of day upon the retrieval process. Moreover, the fact that 
the effect was found in the early evening, and was revealed by a retrieval latency measure, 
vindicates the hesitation of Folkard et al. in drawing unqualified negative conclusions from 
their measure of retrieval probability which was applied only during morning and 
afternoon sessions. In this context, it is also notable that Folkard & Monk (1980, Expt 2, 
available after the present study had been run) have again failed to show an influence of 
even more extreme times of day upon the probability of correct long-term recall and 
recognition of material presented in a medical training film. Retrieval occurred at 04.00 and 
20.30 h in order to sample the supposed minimum and maximum levels of arousal due to 
circadian variation (Colquhoun, 1971). 

The negative findings of Folkard et al. (1977) and Folkard & Monk (1980) would seem 
to provide quite conclusive evidence that time of day does not affect long-term retrieval 
probability — at least for their tasks and material. Their negative findings may have two 
implications for the present positive results. First, it may be that only a retrieval latency 
measure has sufficient sensitivity to detect a circadian influence upon retrieval. Secondly, it 
may be important to account for the ‘difficulty’ or ‘depth’ of the retrieval requirement, for 
the present effect was evident only from the contrast in time taken to access material of 
different dominance; there was no overall effect upon retrieval speed. 

The present results contrast with time of day effects upon short-term memory where 
performance efficiency declines across the day. As noted above, it is unclear from such 
studies whether the circadian fluctuation in arousal adversely affects initial learning or 
retrieval. Obviously, one cannot generalize from the present long-term retrieval effects in 
order to explain the basis of the short-term decrement. For instance, it would be fatuous to 
propose that as long-term retrieval efficiency appears to increase through the day, the 
short-term memory deficit must arise from an adverse effect of increasing arousal upon 
learning. Nevertheless, the present demonstration that time of day may selectively influence 
one component of memory — in this case long-term retrieval, unconfounded by an influence 
upon initial learning — may serve to emphasize the importance of explaining more fully the 
precise nature of the short-term deficit. 

As a final practical, but important point, the present results also have implications for 
the conclusions to be drawn from many semantic memory studies. As in most experimental 
studies, it is administratively convenient to test subjects during the normal working day, i.e. 
in the morning or afternoon. The present results show that this routine overestimates the 
effect of dominance upon latency relative to that which may obtain at other times. Thus, 
the results of many such studies should not be regarded as providing any absolute picture 
of the functioning of semantic memory; the observed effects are entirely specific to the time 
of testing. 


Memorizing facial identity, expression and orientation 


Gail J. Walker-Smith 





Recognition for facial identity, expression, and orientation was investigated in a successive face 
comparison task. Subjects were required to make same/different judgements about pairs of face 
photographs that could differ in any one of these respects. Overall recognition performance for 
identity alterations was superior to that for expression and orientation changes. After a short 
retention interval (1 s) there was no difference between recognition accuracy for different responses to 
identity, expression and orientation alterations, but after a long delay (20 s) some expression and 
orientation information was forgotten while accuracy for identity judgements remained unchanged. 
Subjects could remember some expression and orientation information over a 20 s period, but 
memory for these dynamic attributes was less durable than identity memory. 





Possibly the most important property of a face is its identity. People are highly skilled at 
recognizing facial identity (e.g. Scapinello & Yarmey, 1970; Goldstein & Chance, 1971) and 
recognition performance is surprisingly accurate even after various transformations, such as 
alterations in: pose orientation (e.g. Laughery et al., 1971); expression (e.g. Galper & 
Hochberg, 1971); size (e.g. Patterson, 1978) and age (e.g. Gombrich, 1972, p. 9). Moreover, 
facial memory is remarkably accurate after extremely long retention intervals (e.g. 
stretching over several decades (Bahrick et al., 1975)). Other properties of a face, for 
example the pose orientation and expression, are integral parts of the stimulus. However, 
although these aspects of a face may be perceived, a parsimonious recognition system might 
be able to extract and store identity information and neglect the less salient dynamic 
properties. 

The purpose of the present research was to investigate perception and memory for facial 
identity, expression, and pose orientation. The aim was to assess whether expression and 
orientation are perceived as accurately as identity and to determine whether the time course 
of memory is similar for these three attributes. It was also of interest to discover how 
expression and orientation interact with identity. 

The few studies that specifically address the question of whether orientation is important 
in face recognition suggest that it is not a stimulus attribute that greatly affects facial 
memory. Laughery et al. (1971) reported no significant differences in recognition accuracy 
between searching for a target face — originally presented in four candid views — in a series 
of photographs of front view faces, left profiles, right portrait views, and left portrait views. 
Davies et al. (1978) demonstrated that identification accuracy was similar for full-face 
photographs and ¾ profile photographs. Furthermore, altering the orientation of the faces 
from full face to ¾ profile (and vice versa) between study and test was not detrimental to 
recognition accuracy. 

These data do not necessarily imply that subjects did not encode pose position or were 
unable to remember it. In fact, it has been demonstrated empirically that orientation 
changes between study and test phases of face recognition tasks can, in some cases, have 
adverse effects on recognition accuracy, which suggests that some orientation information 
can be stored. For example, Patterson & Baddeley (1977), like Davies et al., showed that 
after studying face photographs in a full-front view, a change to ¾ profile had no significant 
effect on recognition accuracy, but a change to a profile view did reduce accuracy. The 
latter result was corroborated in a study by Seamon et al. (1978). In an experiment where 
subjects were shown full faces or profiles and were later required to select either 
the previously seen eyes, nose or mouth, performance was significantly better when the pose 
was unaltered. Moreover, when the pose was unaltered, recognition performance was better 
for the profile than the front view. 

The above studies demonstrate some effects of orientation only where pose alterations 
constitute a major stimulus change. Whether or not anything more than gross pose 
position information is registered and remembered is uncertain. 

Ellis (1975) states that ‘there is no evidence to suggest that the expression of a face has 
much bearing on the memory processes underlying facial recognition’ (p. 413). But there is 
empirical evidence which demonstrates that expression does have some effect on face 
memory. Galper & Hochberg (1971) showed people photographs of faces that were either 
smiling or unsmiling. Five days after learning the set of faces, subjects were given a 
two-item forced-choice recognition test, where the ‘old’ photograph had to be selected and 
a ‘new’ one rejected. Performance was significantly poorer in a condition where decoy faces 
were the same as the targets but with alternative expressions, than where the decoys were 
different faces altogether. But, performance was better than chance in the altered expression 
condition, implying that expression was remembered. Unfortunately, from the reported 
data it is impossible to deduce whether there was differential recognition performance for 
faces originally seen smiling as opposed to unsmiling. Sorce & Campos (1974) also 
concluded that ‘perception of facial expression is a parameter of facial recognition’ (p. 79) 
because they found that the greater the intra-subject discrepancy between two successive 
sets of rating scores on both pleasant/unpleasant and sleep/tension dimensions for a set of 
faces, the poorer the recognition performance for these stimuli. 

The evidence reviewed so far demonstrates that both pose and expression information 
can be stored. But, if pose and expression are not as important as identity, does the 
memory representation for a face maintain pose and expression information in a form as 
durable as identity information — possibly retaining the integral nature of these 
attributes — or does differential forgetting occur whereby identity is remembered better? 

An experiment by Walker-Smith (1978) suggests that possibly facial expression is 
forgotten quite rapidly. Recognition performance for single Photo-fit faces was found to be 
significantly worse after a 20 s interstimulus interval than after only 1 s, and both the 
mouth and eyes, but not the hair, nose or chin, were recognized more accurately after the 
short retention interval than after the long one. Although this result was due to a change in 
feature identity, Walker-Smith interpreted this interaction in terms of a loss of information 
about facial expression. Over a short delay it would be useful to store specific arrangements 
of these features so that changes in expression could be monitored. Yet over a longer 
period it would be sufficient merely to store information required for face identification — in 
which case, configurational nuances would be forgotten. 

The present experiment was designed to test predictions derived from the Walker-Smith 
(1978) investigation, concerning the effect of the length of inter-stimulus interval on 
recognition of identity, expression and orientation. Using a successive face recognition task, 
performance was tested after a short and long delay, for faces that could differ either by 
identity (different person), expression or orientation. Firstly it was expected that 
recognition performance would be superior for identity alterations rather than for 
alterations of expression or pose. Although identity seemed to be the most salient attribute, 
there was no firm a priori reason for predicting a difference in performance between 
expression and orientation. A second major prediction was that recognition performance 
for identity, expression and orientation would interact with delay. More specifically, it was 
predicted that delay would harm expression and pose memory more than identity memory. 

Additional aims of this study were to determine whether some expressions or poses are 
easier to remember than others. 

There is empirical evidence which demonstrates that attractive faces are remembered 
better than less attractive ones (e.g. Cross et al., 1971). Also, faces rated as either high or 
low in attractiveness and pleasantness are remembered better than those in the middle of 
the range (e.g. Peters, 1917; Shepherd & Ellis, 1973). Consequently, if attractiveness and 
pleasantness are equated with smiling faces (see Ellis, 1975, p. 412), one might expect that 
memory for a ‘neutral’ face would be worse than that for a ‘happy’ (smiling) or ‘sad’ 
(unsmiling) face. 

It is possible that an abstracted face representation exists in a standardized form and 
various suggestions for a prototype face could be made. With respect to a single facial 
attribute — orientation — the prototypical representation might be the full-front view since 
this pose is often considered, albeit tacitly, to be the standard face orientation. 
(Consequently, when face photographs accompany documents, e.g. passports, usually a 
front pose is required.) Alternatively, a portrait view might be a better prototype because it 
gives more information about contours. It is difficult, therefore, to provide any a priori 
reasons for predicting that any particular orientation should be better remembered than 
any other. However, it is assumed that as the orientation difference between two 
photographs of one person's face increases, the similarity of the stimuli will decrease. Thus, 
it is predicted that where the difference in pose orientation within trials is large, subjects 
will be able to respond ‘different’ more quickly, more accurately, and more confidently 
than where the orientation difference is small. 


Method 

Subjects 

Ten female members of the Oxford Subject Panel, whose ages ranged from 18 to 31 years, were paid 
for participating in this experiment. 


Materials 


The experimental stimuli comprised a set of 27 black-and-white photographs of male Caucasian faces. 
This set was composed of nine photographs of three people of similar age who had short dark hair. 
Each person’s face was simultaneously photographed from three directions, once when the 
expression was ‘neutral’, once when it was ‘happy’ (smiling), and finally when a ‘sad’ expression was 
posed. One set of three expressions was photographed from a full-front position, the other two were 
of right-facing portrait views taken at approximately 14° (turn) and 30° (portrait) from the front (see 
Fig. 1). The photographs were centred on white cards and the neck and shoulders were masked by 
white paint. The heads measured approximately 60 by 45 mm. When displayed in a tachistoscope 
they subtended approximately 7° by 5° of visual angle (equivalent to viewing a real head at a distance 
of about 2 m). A similar set of 27 stimuli was made using three other people's faces, to be used as 
practice stimuli. 
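
The visual angles quoted above follow from the usual relation θ = 2 arctan(s / 2d), where s is the stimulus size and d the viewing distance. The sketch below is illustrative only: the 500 mm viewing distance and the 240 mm height for a real head are assumptions chosen to be consistent with the reported figures, and neither value is given in the text.

```python
import math


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Visual angle in degrees subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


# Assumed viewing distance inside the tachistoscope (not stated in the text).
ASSUMED_DISTANCE_MM = 500.0

# Photographed head: approximately 60 mm high by 45 mm wide.
height_angle = visual_angle_deg(60.0, ASSUMED_DISTANCE_MM)
width_angle = visual_angle_deg(45.0, ASSUMED_DISTANCE_MM)

# Assumed real head height of 240 mm viewed from 2 m.
real_head_angle = visual_angle_deg(240.0, 2000.0)

print(f"height: {height_angle:.2f} deg, width: {width_angle:.2f} deg")
print(f"real head at 2 m: {real_head_angle:.2f} deg")
```

Under these assumptions the photograph subtends roughly 7° by 5°, and a real head seen from 2 m subtends the same vertical angle, in line with the equivalence stated in the text.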

A further tachistoscope card was fabricated. This was a pattern mask which consisted of a mosaic 
made from parts of a face picture which had been cut into small pieces and scrambled. 


Procedure 


Subjects were run individually in four 1 h sessions. There were two experimental conditions which 
were blocked (short delay and long delay), so subjects received two sessions with each condition. The 
session order was counterbalanced. 

Subjects viewed stimuli through a three-field tachistoscope. Each session was started with a 
practice run of 20 trials and then 81 experimental trials were administered. Over the two experimental 
sessions for the same condition, there were 81 trials where the pairs of faces were identical and 81 
trials where they differed. The non-identical face pairs comprised 27 different cases where the 
expression and orientation were similar but the identity of the faces altered (nine each of: Face A 
with Face B, Face A with Face C, and Face B with Face C), 27 pairs where identity and orientation 
were unchanged but the expression altered (nine each of: ‘neutral’ with ‘happy’, ‘neutral’ with ‘sad’, 
and ‘happy’ with ‘sad’), and 27 pairs where the identity and expression were unchanged but the 
orientation altered (nine each of: front with turn, front with portrait, and turn with portrait; see Fig. 
1). There were equal numbers of same and different face pairs in each condition. The stimulus order 
was randomized and so was the presentation order of stimuli within different trials. 

Figure 1. Examples of stimulus pairs. Top: expression change (‘happy’, full, Face A, and ‘sad’, full, 
Face A); middle: orientation change (turn, ‘neutral’, Face C, and portrait, ‘neutral’, Face C); lower: 
identity change (Face C, ‘neutral’, full, and Face A, ‘neutral’, full). 
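
The composition of the trial set described in this paragraph can be reproduced by enumeration. The following is a minimal sketch, not the procedure actually used to prepare the cards: the variable and function names are hypothetical, and the assumption that each of the 27 stimuli served three times as a same pair is inferred from the stated total of 81 same trials.

```python
from itertools import combinations, product

IDENTITIES = ["Face A", "Face B", "Face C"]
EXPRESSIONS = ["neutral", "happy", "sad"]
ORIENTATIONS = ["full", "turn", "portrait"]

# The 27 stimuli: every identity x expression x orientation combination.
STIMULI = list(product(IDENTITIES, EXPRESSIONS, ORIENTATIONS))


def different_pairs(attribute_index: int):
    """Stimulus pairs differing on exactly one attribute.

    attribute_index: 0 = identity, 1 = expression, 2 = orientation.
    """
    return [
        (a, b)
        for a, b in combinations(STIMULI, 2)
        if all((a[i] != b[i]) == (i == attribute_index) for i in range(3))
    ]


identity_pairs = different_pairs(0)
expression_pairs = different_pairs(1)
orientation_pairs = different_pairs(2)

# Assumption: each stimulus is paired with itself three times (81 / 27).
same_pairs = [(s, s) for s in STIMULI] * 3

print(len(identity_pairs), len(expression_pairs), len(orientation_pairs), len(same_pairs))
```

Each type of alteration yields 27 pairs (three attribute pairings by nine combinations of the two unchanged attributes), giving 81 different and 81 same trials per delay condition.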

After the experimenter had warned the subject of the commencement of a trial, the first face of a 
pair (the target) was presented for 2 s. The target was immediately followed by the pattern mask 
which remained on the screen for 1 s; its purpose was to prevent the use of iconic information in the 
short-delay condition. In the short-delay condition the next stimulus (the test face) directly followed 
the mask. In the long-delay condition the screen remained illuminated but blank for 19 s before the 
test face was displayed and, 17 s after the mask offset, a warning tone was sounded for 1 s. The test 
face was exposed for 1 s and was again immediately followed by the mask which was displayed for 
1 s. This second presentation of the mask was designed to indicate the end of the trial; it generally 
followed the subject's response and presumably did not affect performance. During the long delay, a 
verbal interference task was introduced in order to prevent verbal rehearsal. Subjects were required to 
count backwards in threes from a three-digit number that was read to them by the experimenter at 
the beginning of the inter-stimulus interval. The counting terminated when the warning tone sounded. 
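
The timing of a single trial can be laid out as a simple event list. This is a sketch under the durations stated above, with times measured from target onset; the function names are hypothetical and the code is not a reconstruction of the tachistoscope control.

```python
def trial_timeline(long_delay: bool):
    """Return (event, onset_s, duration_s) tuples for one trial, timed from target onset."""
    events = [("target", 0.0, 2.0), ("mask", 2.0, 1.0)]
    mask_offset = 3.0
    if long_delay:
        # Blank illuminated screen for 19 s, with a 1 s warning tone starting 17 s after mask offset.
        events.append(("blank screen", mask_offset, 19.0))
        events.append(("warning tone", mask_offset + 17.0, 1.0))
        test_onset = mask_offset + 19.0
    else:
        # The test face directly follows the mask.
        test_onset = mask_offset
    events.append(("test face", test_onset, 1.0))
    events.append(("mask", test_onset + 1.0, 1.0))
    return events


def inter_stimulus_interval(long_delay: bool) -> float:
    """Time in seconds from target offset to test-face onset."""
    events = trial_timeline(long_delay)
    target_offset = events[0][1] + events[0][2]
    test_onset = next(onset for name, onset, _ in events if name == "test face")
    return test_onset - target_offset


print(inter_stimulus_interval(False), inter_stimulus_interval(True))
```

On this reading the inter-stimulus interval is 1 s in the short-delay condition (the mask alone) and 20 s in the long-delay condition (mask plus 19 s blank).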

The subjects were instructed to respond ‘same’ to a test photograph that was identical to the 
target and ‘different’ otherwise. They were told to respond as quickly and as accurately as they could 
by pressing appropriately labelled response switches which stopped a digital timer started by the 
onset of the test stimulus. For half the subjects the ‘same’ key was on the right and the ‘different’ 
key was on the left, while the other half had the opposite arrangement. 

Subjects were informed that on about half the trials the second face would be identical to the first, 
while for the other trials the second face would alter by either an identity, expression, or orientation 
change. After subjects had made their same/different judgement they were asked to indicate how 
confident they were that their judgements had been correct (1 — confident, 2 — fairly confident, 3 — not 
very confident, 4 — guess). 


Design and analyses 


Correct reaction times, confidence ratings for correct responses and error rate were the dependent 
measures recorded in this study. Separate analyses of variance were performed for each type of data, 
and responses for same and different stimulus pairings were also considered independently. 

In the main set of analyses only different responses were analysed. The independent variables, type 
of alteration between stimulus pairs (identity, expression, orientation) and delay (short, long), were 
factorially combined in a 10 (subjects) × 3 × 2 randomized blocks design. 

Similar 10 × 3 × 2 randomized blocks designs were used to investigate the effects of presenting 
particular identities, expressions and orientations. In one set of identity analyses, comparisons were 
made between trials where both face stimuli were either Face A, Face B, or Face C. In these three 
groups the differences between stimulus pairs were either expression or orientation alterations. Two 
analogous sets of analyses were carried out for expression (‘neutral’, ‘happy’, ‘sad’) and orientation 
(full, turn, portrait). Another set of analyses was used to investigate the effects of altering stimuli by 
particular changes of identity, expression or orientation. For the identity set of analyses the three 
types of stimulus pairings were: Face A with Face B, Face A with Face C, and Face B with Face C. 
The expression pairs were: ‘neutral’ with ‘happy’, ‘neutral’ with ‘sad’, and ‘happy’ with ‘sad’. And 
the orientation pairs were: full with turn, full with portrait, and turn with portrait. 

Finally, same responses were considered in a set of analyses where identity, expression, orientation, 
and delay were factorially combined in a 10 × 3 × 3 × 3 × 2 randomized blocks design. 
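
The degrees of freedom implied by these designs can be worked out directly. The sketch below assumes that each within-subject effect is tested against its interaction with subjects, which is an assumption about the analysis rather than something stated in the text; it is, however, consistent with the degrees of freedom reported in the Results.

```python
from math import prod

N_SUBJECTS = 10


def within_subject_df(levels):
    """Return (effect d.f., error d.f.) for an effect over the given factor levels.

    Assumes the error term is the effect x subjects interaction.
    """
    df_effect = prod(k - 1 for k in levels)
    return df_effect, df_effect * (N_SUBJECTS - 1)


alteration_df = within_subject_df([3])                  # type of alteration
delay_df = within_subject_df([2])                       # delay
alteration_by_delay_df = within_subject_df([3, 2])      # alteration x delay
identity_by_expression_df = within_subject_df([3, 3])   # same-pairs analysis

print(alteration_df, delay_df, alteration_by_delay_df, identity_by_expression_df)
```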


Results 
Analyses of different responses 
Results for the main set of analyses are shown in Table 1. 

The main effect of the type of alteration between pairs of stimuli was significant for 
confidence ratings, F = 8.07, d.f. = 2, 18, P < 0.01; reaction time, F = 10.52, d.f. = 2, 18, 
P < 0.001; and errors, F = 5.26, d.f. = 2, 18, P < 0.05. Pairwise comparisons using 
Newman-Keuls tests (Kirk, 1968; all subsequent pairwise comparisons are made using 
this test) show that performance for detecting changes in identity was better than that for 
expression or orientation alterations: confidence and reaction time, P < 0.01; errors, 
P < 0.05. There was no difference between performance for expression and orientation. 

Table 1. Performance scores for different stimulus pairs 

                      Mean confidence     Total error rateᵃ          Mean reaction 
                      rating                                         timeᵇ 
Type of alteration    Short     Long      Short        Long          Short     Long 
Identity              1.00      1.07      4 (1.48)     9 (3.33)      592       721 
Expression            1.03      1.36      20 (7.41)    52 (19.26)    658       880 
Orientation           1.01      1.37      9 (3.33)     45 (16.67)    640       854 

Note. Scores are shown for short and long inter-stimulus intervals. 
ᵃ Numbers in parentheses indicate percentage errors. 
ᵇ Reaction time is shown in ms. 


Performance in the recognition task was also significantly better after the 1 s retention 
interval than after the 20 s delay: confidence ratings, F = 10.65, d.f. = 1, 9, P < 0.01; reaction 
time, F = 24.97, d.f. = 1, 9, P < 0.001; and errors, F = 50.97, d.f. = 1, 9, P < 0.001. 

Finally, the interaction between type of alteration and delay was significant for both 
confidence ratings, F = 6.20, d.f. = 2, 18, P < 0.01, and errors, F = 7.20, d.f. = 2, 18, 
P < 0.01, though not for latency. Thus, at the short delay performance for the three types 
of judgement was similar but, as predicted, after the long delay memory for expression and 
orientation was harmed more than identity memory. There was no significant difference 
between performance at the two delays for identity judgements but for expression and 
orientation differences, accuracy and confidence scores were worse after the long delay, 
P < 0.01. 

The six sets of separate identity, expression and orientation analyses of variance gave 
little evidence to suggest that particular identities, expressions or poses were easier to 
remember than others. There were a few significant sources of variation, but no effects 
obtained consistently across the three measures (accuracy, confidence, latency). There are 
thus no clear patterns which merit discussion. 


Analyses of same responses 


Performance scores for same stimulus pairs are presented in Table 2. 

For the same stimulus pairs data, the only significant main effect was that of delay. After 
the short delay confidence was greater, F = 23.17, d.f. = 1, 9, P < 0.001, and reaction time 
was quicker, F = 19.74, d.f. = 1, 9, P < 0.01, than after the long delay; the error score 
difference, however, was non-significant. 

Accuracy measures yielded a single significant interaction, for orientation by delay, 
F = 7.27, d.f. = 2, 18, P < 0.01. There was no difference in recognition for full, turn or 
portrait poses at the short delay, and accuracy for the turn faces was similar at both delays. 
However, for the full (P < 0.01) and portrait (P < 0.05) poses, accuracy fell after the long 
delay, and accuracy for the full face was worse after the long delay than accuracy for the 
turn pose at either delay, P < 0.01. 

Table 2. Performance scores for same stimulus pairs 

Mean confidence rating    Total error rateᵃ         Mean reaction timeᵇ 
Short     Long            Short        Long         Short     Long 
1.04      1.28            37 (4.57)    71 (8.77)    615       803 

Note. Scores are shown for short and long inter-stimulus intervals. 
ᵃ Numbers in parentheses indicate percentage errors. 
ᵇ Reaction time is shown in ms. 

The latency data gave a significant identity by expression by delay interaction, F = 3.14, 
d.f. = 4, 36, P < 0.05. Response latencies were longer after the long delay except for the 
‘neutral’ and ‘sad’ expression of Face A and the ‘neutral’ expression of Face C, where the 
differences were not significant. 

Finally, for confidence ratings there was a significant identity by expression interaction, 
F = 2.96, d.f. = 4, 36, P < 0.05. The only significant pairwise difference was that 
confidence was greater for the ‘happy’ Face A than the ‘neutral’ Face B, P < 0.05. 

Additional analyses for same and different responses showed that there were no 
significant differences between the first and last half of sessions which suggests that 
performance was not subject to within-session interference from repeated exposure to the 
same small set of faces. Across sessions, the only significant effect was that latencies tended 
to decrease with practice. 

Discussion 

Judgements about whether pairs of successively presented photographs were identical or 
not proved to be quite easy. In fact, overall, less than 10 per cent of the total responses 
were incorrect (same errors: 6.67 per cent, different errors: 8.58 per cent). Even after the 
long retention interval accuracy was well above chance level. This suggests that both 
expression and orientation, together with identity information, were encoded and could be 
stored. 
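
The percentages quoted here and in Tables 1 and 2 can be recovered from the raw error totals. The sketch below takes the error counts from the tables and the trial counts from the Procedure (27 different trials per alteration type and 81 same trials, per subject and per delay condition); the variable names are hypothetical.

```python
N_SUBJECTS = 10
DIFFERENT_TRIALS_PER_CELL = 27   # per subject, per alteration type, per delay
SAME_TRIALS_PER_DELAY = 81       # per subject, per delay

# Total error counts from Tables 1 and 2, as (short delay, long delay).
different_errors = {"identity": (4, 9), "expression": (20, 52), "orientation": (9, 45)}
same_errors = (37, 71)


def percent(errors: int, trials: int) -> float:
    """Error rate as a percentage, rounded to two decimal places."""
    return round(100 * errors / trials, 2)


# Percentages per cell of Table 1 (270 trials per cell across subjects).
cell_trials = N_SUBJECTS * DIFFERENT_TRIALS_PER_CELL
different_cell_pct = {
    kind: tuple(percent(e, cell_trials) for e in counts)
    for kind, counts in different_errors.items()
}

# Overall percentages (1620 same and 1620 different trials across both delays).
trials_per_type = N_SUBJECTS * SAME_TRIALS_PER_DELAY * 2
different_overall_pct = percent(sum(sum(c) for c in different_errors.values()), trials_per_type)
same_overall_pct = percent(sum(same_errors), trials_per_type)

print(different_cell_pct)
print(same_overall_pct, different_overall_pct)
```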

The data for different decisions show that the three types of face information manipulated 
in this experiment — identity, expression, and orientation — were not equally salient. As 
expected, responses to identity changes were faster, more accurate, and were made more 
confidently than those for expression or orientation alterations, while no significant 
difference was found between expression and orientation judgements. This was not a 
surprising outcome since expression and orientation changes, within trials, were more 
subtle than the gross identity differences. 

Over the 20 s retention interval face recognition performance deteriorated — response 
accuracy and confidence fell and latencies increased. This result corroborates 
Walker-Smith's (1978) findings where, after a 20 s delay, there was a similar performance 
decrement in a Photo-fit face recognition task. 

The effect of delay interacted with the type of stimulus alteration for confidence and 
accuracy measures but not latencies. As predicted, the delay had a greater detrimental 
effect on expression and orientation recognition than on identity memory. Forgetting of 
expression and orientation occurred over the 20 s retention interval, but accuracy (and 
confidence) on identity judgements was not significantly affected by the long delay. Hence 
memory for expression and orientation was more vulnerable to delay than identity 
information. It cannot be argued that the expression and orientation differences were just 
too difficult to discriminate because, after the short delay, accuracy (and confidence) for 
recognizing expression and orientation changes was no poorer than for identity alterations. 
Thus the subjects could perceive the differences in expression and orientation but could not 
preserve the information about them as well as they could about identity. The possibility 
that the data merely demonstrate that gross alterations (i.e. identity differences) are easier 
to detect over time than small changes should not be overlooked. It seems reasonable to 
suppose that at the short delay ceiling effects in the confidence and accuracy data might 
mask an effect of delay on identity judgements which is similar to that on expression and 
orientation judgements. However, the latency results suggest that this interpretation is 
unlikely. Although the interaction did not reach significance for the latency measures this 
can be accounted for by the behaviour of two subjects whose response times deviated from 
the general pattern of results. An analysis of variance performed on the latency data from 
the remaining eight subjects gave a significant interaction between delay and type of 
stimulus alteration which corroborated the error and confidence findings. 

The interaction between type of alteration and delay suggests that in comparison with 
the more permanent and more important identity information, detailed memory for facial 
configurations, which includes expression and orientation information, is forgotten 
relatively rapidly from a post-iconic store. This differential forgetting suggests that the 
stored face representation is not preserved as an integrated unit. For, if it were, then if 
identity information could be retrieved, expression and orientation would also be available 
and a main effect of delay might be expected but not an interaction between the type of 
stimulus information and the retention interval. Possibly each type of information is stored 
in a similar manner but ‘trace strength’ is greater for identity. An alternative explanation 
for the differential forgetting is that although identity information appears to have little, if 
any, verbal component in memory (Walker-Smith, in preparation), expression and 
orientation coding may have a verbal component which is adversely affected by the verbal 
interference task. 

The similarity of recognition ability for expression and orientation could be an 
experimental artifact because under normal circumstances it might be useful to remember 
facial expression but not orientation, whereas here, equal stress was laid upon memorizing 
both attributes. It might also be expected that recognition performance for expression and 
orientation would be better in this experiment than in 'real life' because subjects were told 
that they were required to remember expression and orientation. However, this conclusion 
is not necessarily justified because other differences between ‘real life’ and laboratory 
studies need to be taken into account. For example, it could be argued that expression is 
important to memorize because one needs to infer behavioural consequences from 
information gained from a person's expression. In the laboratory experiment, though, the 
stimuli were only photographs so the subjects might have been less motivated to attend to 
the expressions and to memorize them. 

There is empirical evidence to suggest that some faces are more difficult to remember than 
others (e.g. Goldstein & Chance, 1971; Shepherd & Ellis, 1973; Goldstein et al., 1977). 
Therefore in order to determine whether any of the faces selected for this experiment were 
particularly easy to memorize, recognition performance for each face was examined. The 
results indicated that for different stimulus pairs, recognition accuracy and latency were 
similar for each face and no single combination of pairs of different identities was more 
difficult to distinguish than any other. The only suggestion that any particular face was 
easier to process than any other came from confidence scores for different trials where 
responses to Face A were made more confidently than those for Face B and, for the same 
data there was a three-way interaction between identity, expression and delay. Overall, 
though, the data suggest that the three faces were relatively well matched. 

The prediction that ‘neutral’ faces would be remembered less well than ‘happy’ or ‘sad’ 
faces was not supported by the results from this experiment. The different analyses failed to 
show any significant differences in recognition performance for any of the posed 
expressions and there was no main effect of expression in the same data. The three-way 
interaction between identity, expression and delay for same trial latencies also failed to 
distinguish between processing speed for each expression at either delay. However, the only 
significant pairwise difference in the identity by expression interaction for same trial 
confidence ratings was in the predicted direction — confidence for the ‘happy’ Face A was 
greater than for the ‘neutral’ Face B. Possibly viewing the three faces in each expression 
reduced the likelihood of finding an effect of expression in this face recognition task. Since 
each expression was familiar, no single face should have acquired one particular affect label. 
Therefore, perhaps memory for a particular expression is only found to be better than any 
other when viewers see just one sample of an expression in a to-be-remembered face. 

Recognition performance was not affected by the orientation of the stimuli when both 
the faces within different trials were photographed from the same direction. This suggests 
that the turn and portrait faces were no more difficult to recognize than the frontal views. 
The a priori predictions concerning orientation, namely that bigger changes would be more 
readily detected, were not supported. It should be remembered, however, that only large 
changes in orientation (such as full-face to profile) have affected performance in previous 
studies, and the biggest change employed here was 30°. 

In summary, the results of this experiment demonstrate that over a 20 s retention 
interval, where verbal rehearsal is prevented, facial identity, expression, and orientation can 
be remembered. However, although the delay has no effect on identity retention, 
recognition performance for expression and orientation is harmed. Since identity is 
arguably a more important facial attribute than expression and orientation it would seem 
useful for an efficient recognition system to retain the more permanent information longer 
than the dynamic facial attributes. Finally, there was no strong evidence to suggest that 
specific expressions and orientations were particularly easy to remember. 

Sex and ethnic differences on a spatial-perceptual task: Some hypotheses tested 


Gustav Jahoda 





The most prevalent version of the environmental hypothesis, by Sherman (1967), would predict a sex 
difference in spatial performance in Scotland but not in Ghana. With regard to the particular task 
used, namely 3D mental rotation, the same prediction would arise on the basis of Serpell's (1979) 
‘specific experience’ hypothesis. Subjects were 40 boys and 40 girls in both Ghana and Scotland, 
equated for years of education. The results showed a sex difference of the same magnitude in both 
cultures, thereby throwing doubt on purely environmental interpretations. The findings also clearly 
argue against a genetic hypothesis put forward by Jensen (1975). 





In a previous paper (Jahoda, 1979) it was argued that ‘spatial ability’ is not, as is 
frequently implied, a single homogeneous entity capable of being assessed equally well by 
means of quite different tests or tasks. It was further suggested that this fact might account 
for the numerous inconsistencies and contradictions to be found in the literature. The study 
itself demonstrated that ethnic and sex differences may be a function of the nature of the 
task. Thus for cube construction from a 3D model there was a sex, but no cultural 
difference; when the task involved a 2D mental rotation there was a cultural, but no 
sex difference. It had been hoped to include also a 3D rotation of the kind in which sex 
differences are typically found. The difficulty of elaborating a task suitable for 
cross-cultural work proved greater than had been anticipated, so that it was not ready in 
time for the earlier field trip. Subsequently it was possible to conduct a separate study 
employing 2D representations of 3D objects. 

However one may conceptualize ‘spatial ability’, there is no doubt that the 
transformation of mental images constitutes a key element of it. This does not merely 
apply to cultural but also sex differences for, as McGuinness & Pribram (1978) pointed out, 
males seem to be relatively better at rotating visual images into new planes. Hence such a 
task appears eminently suitable for examining hypotheses intended to explain performance 
differences. Moreover, a design that looks at sex differences within contrasting cultural 
contexts permits a scrutiny of some hypotheses that could hardly be tested intra-culturally. 

The literature on the factors influencing ‘spatial ability’ has been critically and 
exhaustively surveyed by Harris (1978). Although he was concerned mainly with sex 
differences, the same considerations apply to cultural/ethnic ones. His major categories of 
possible determinants, namely genetic, neurological and environmental, therefore apply 
both to cultural/ethnic and to sex differences, although the categories are of course not 
mutually exclusive. 

As far as genetic determinants are concerned, it has been claimed in the past that ‘spatial 
ability’ is likely to be a sex-linked recessive character (Hartlage, 1970; Bock & Kolakowski, 
1973) but more recent work indicated that the mode of inheritance is more complex (DeFries 
et al., 1976; Loehlin et al., 1978; Park et al., 1978). In the light of the preceding 
comments, it may be that such divergent views result in part from varying modes of 
assessing ‘spatial ability’. The search for the ‘spatial gene’ via ‘pure tests of spatial ability’ 
is perhaps misguided. This has been explicitly recognized by some workers, such as 
Guttman (1974), who stressed the need for ‘the definition of specific abilities and the 
elucidation of the mode of transmission of some of these abilities’ (p. 283). With regard to 
ethnic differences, genetic factors are generally relegated to a residual explanatory category 
in the spirit of John Stuart Mill. One of the exceptions is a hypothesis put forward by 
Jensen (1975, 1978), to be described later, that is specific enough to be capable of being 
tested with the present data. 

Neurological accounts focus on cerebral localization, though there are differing 
hypotheses about this process by Buffery & Gray (1972) on the one hand, and Levy (1969, 
1976) on the other. All are concerned only with sex differences, and no adequate 
information is available about ethnic variations in lateralization. A recent version of the 
neurological approach proposes that it is not sex as such, but the greater rate of physical 
maturation that affects the organization of the higher cortical functions; late maturers of 
both sexes are said to show greater lateralization and higher spatial scores (Waber, 1976, 
1977). Since there is evidence that Africans do not reach sexual maturity earlier than Scots 
(Tanner, 1961; Donovan & Bosch, 1965), the hypothesis is not relevant to cultural 
differences in the present sample. 

Most common are environmental formulations, ranging from very broad to highly 
specific. Prominent among the former is that of Berry (1966, 1971), often mentioned in 
discussions of both cultural and sex differences in ‘spatial abilities’. Briefly and rather 
crudely, Berry concentrates on modes of socialization as constrained by the eco-system, the 
status of women within the culture also being taken into account. Berry's findings tend to 
be cited as showing that there are a number of cultures where sex differences in ‘spatial 
ability’ are absent. However, it has been noted by several commentators that out of the test 
battery he used only one could be regarded as a strictly ‘spatial’ test (Irvine, 1969; Vernon, 
1969; Grant, 1970; Ord, 1970). Moreover, in his more recent work Berry (1976) himself 
was careful to refer merely to sex differences in psychological differentiation. If one wished 
to base a prediction on his earlier ‘spatial’ formulations, it would be that spatial 
performance should be higher among the Scots. Sex differences in favour of males would 
also be expected, but one could probably not predict their relative magnitude in the two 
cultures. 

While Berry operated with such broad mediating factors as degree of strictness of 
parental control, more direct influences were envisaged by Sherman (1967). She viewed the 
development of spatial abilities as shaped by culturally determined sex-role patterns 
leading to differential experience with relevant materials. Thus on this ‘bent twig’ 
hypothesis, as Sherman called it, social pressures will result in boys spending more time on 
such activities as block building or model construction. Thereby superior manipulative and 
spatial skills are acquired, which then generalize to all types of spatial tasks. This 
hypothesis enjoys wide support (e.g. Birns, 1976; Salkind, 1976; Connor et al., 1978) 
though it should be noted that all the evidence (and it is mainly indirect) comes from 
within western industrial cultures. In Ghana, apart from a tiny élite minority, neither boys 
nor girls have access to building blocks, mechanical toys, or such-like. Children's activities 
are directed toward adult sex roles, girls being more oriented towards domestic tasks, while 
boys are more concerned with farm work. There is, however, a great deal of overlap, 
especially in the case of younger children, where both sexes may be expected to look after 
infants. It is possible to observe some sex differences in manipulative play; for instance 
boys, but seldom girls, make ‘lorries’ by fixing a round tin to the end of a stick. However, 
there is no indication of any pressure towards such differentiation on the part of parents, 
who display a marked indifference towards children's play unrelated to adult roles. Given 
these contrasting conditions, it is possible to make a clear prediction based on Sherman's 
hypothesis. Before doing so, a brief account will be given of Serpell’s (1979) position, which 
in this case leads to exactly the same prediction. As he freely acknowledges, it is near the 
‘specific’ extreme of the generality-specificity continuum. Rejecting any broad cognitive 
constructs such as Vernon's (1967) ‘practical intelligence’ or ‘field dependency’, conceived 
as determined by the global character of the environment, Serpell focuses on the subjects’ 
familiarity and direct experience of particular spatial-perceptual skills. Unlike Sherman, he 
is not greatly concerned with the socio-cultural factors supposedly governing exposure to 
relevant experiences. 

The background of specific experience relating to the fitting together of parts of 
three-dimensional objects is fairly clear as regards the present samples: Scottish children 
will have had considerable experience in this sphere, and boys are likely to have had 
substantially more than girls; but Ghanaian children of both sexes will have had little, if 
any, relevant experience. Thus a prediction conforming to both Sherman's and Serpell's 
formulations would be as follows: 

(a) compared with Scottish children, Ghanaians will perform less well on the task; 
(b) a significant sex difference will be manifest for the Scots, but be either absent or at least 
significantly smaller among Ghanaians. 

Prediction (a), common to all environmental hypotheses, is of course fairly trivial, while 
(b) concerning intra-cultural sex differences is critical. 


Method 

Subjects 

These were 80 schoolchildren in each of Ghana and Scotland, consisting of 10 boys and 10 girls drawn 
randomly from the class registers in each of four classes. In Ghana these were Middle School classes 
1-4, with modal ages approximately 13, 14, 15 and 16, though there was considerable (mainly 
upward) variation within the classes; in Scotland the classes were Primary 7 and Secondary 1-3, with 
mean ages close to 12, 13, 14 and 15, respectively. This means that the two samples were equated for 
years of education, but not age. Ghanaian children were from the middle school in a village housing 
mainly junior workers at the University of Ghana, including clerks, artisans, porters and labourers. 
The Scottish children were at schools whose catchment area is mainly semi-skilled working class, the 
primary school being one of the feeders for the secondary one. 


Materials 


The development of a suitable task with an appropriate gradient of difficulty, yet simple to 
administer, required extensive piloting. Initial exploration indicated that a task of the kind 
employed by Vandenberg & Kuse (1978) could not be readily explained to Ghanaian 
children. The basic format eventually adopted was that of sets of three-dimensional shapes 
which either could or could not be put together to form a cube. Originally 20 such sets 
were used, half of them fitting together and the other half distractors. The problem was 
then encountered that such tasks were either too hard for all but a few children, or the 
distractors were too easy to recognize as such. In order to overcome this, the proportion of 
distractors was reduced from one-half to one-third; this made it feasible to ensure that all 
were ‘non-obvious’ ones. The data from the earliest, excessively difficult, versions were 
used to test whether there were any sex-differences in guessing strategies; none of these 
tests even approached significance. 

The final version of the task consisted of colour photographs (88 × 130 mm) showing 
variously shaped pieces of plastic foam against a neutral background. In the study 72 of 
these were used, embodying a systematic variation of angle of presentation in a balanced 
design. Since angle of presentation yielded no significant effect, it will be simpler to ignore 
it hereafter and refer to only 36 photographs in four sets of nine. Half the sets had two 
pieces per picture and the other half three pieces. Within each set there were six 
photographs of pieces which, when mentally rotated and assembled, would make a perfect 
cube (positive); the remaining three pictures contained pieces that would not make a cube 
(negative). Examples of the stimulus material are shown in Fig. 1 in the form of line 
drawings (for better reproduction). The front surfaces, darkened in the illustration, were 
red in the photographs. The purpose of this was to facilitate the task slightly; for the same 
reason some edges of the pieces were accentuated by dark lines, since they did not show up 
well against the light foam texture. 


Figure 1. Examples of ‘positive’ (left) and ‘negative’ (right) components. 


Procedure 


At the outset subjects were shown one positive and one negative picture, together with the real foam 
shapes. The correspondence between the three-dimensional objects and their pictorial representation 
was made obvious by juxtaposing them. Then it was demonstrated that one of the pairs of pieces 
could be put together to form a perfect cube, while the other could not. Subjects were encouraged to 
handle the pieces and fit them together. This demonstration was followed by a training session in 
which four pictures were placed on the table, and at the top edge of each photograph the 
corresponding pieces were positioned in the same orientation. Subjects were then asked to guess for 
each one whether or not they could make a cube, and to verify their answer by manipulating the 
pieces. None of the subjects had any difficulty in understanding the task. 

Next it was explained that they would be given more of these pictures, and all they would have to 
do was to decide whether the photographs showed pieces that would or would not make up a cube 
and sort them accordingly. In order to avoid any ambiguity, a model cube of the correct size was 
placed on the table on their right. The first set of nine photographs was then arrayed in front of 
them in a 3 × 3 matrix in random order, the same procedure being followed with the remaining three 
sets. Subjects were allowed to set their own pace and there was no time limit. 

Responses were scored in two alternative ways. The first was based on the outcome of the previous 
tests showing no significant differences in guessing strategies and assuming therefore a chance score 
of zero. The second introduced an additional safeguard by avoiding any assumptions about guessing 
strategies. Both yielded identical outcomes, and only the latter will therefore be explained. 

It will be recalled that subjects were asked to place those pictures they judged correct on the right, 
and as these responses contained all the information (since the remainder must be on the left) they 
were used to arrive at the score. On a chance basis one would expect two-thirds of the responses on 
the right to be correct. Hence the score was taken to be the amount by which the subject exceeded 
the chance base; it was multiplied by three to obtain a whole number. The formula is set out below: 


Score = 3 (number correct − 2/3 total) 
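
The scoring rule just described can be sketched in code. This is only an illustrative sketch: the function name `cube_task_score` and its parameter names are hypothetical and do not come from the study. It takes the number of photographs a subject placed on the right and the number of those that were truly ‘positive’, and returns three times the excess over the two-thirds chance expectation.

```python
def cube_task_score(n_correct_right: int, n_placed_right: int) -> int:
    """Chance-corrected score: 3 * (correct - 2/3 * total).

    n_placed_right  -- photographs the subject sorted to the right ("makes a cube")
    n_correct_right -- how many of those really were 'positive' photographs
    (Both parameter names are illustrative assumptions.)
    """
    if not 0 <= n_correct_right <= n_placed_right:
        raise ValueError("need 0 <= n_correct_right <= n_placed_right")
    # Multiplying out the factor of three keeps the arithmetic in whole numbers:
    # 3 * (c - 2t/3) == 3c - 2t.
    return 3 * n_correct_right - 2 * n_placed_right


if __name__ == "__main__":
    # Sorting every one of the 36 photographs to the right (24 are positive)
    # earns no credit, as a chance-corrected score should.
    print(cube_task_score(24, 36))
    # Sorting exactly the 24 positive photographs to the right gives the maximum.
    print(cube_task_score(24, 24))
```

On this rule an indiscriminate sorter scores zero, which is the sense in which the score measures the amount by which a subject exceeds the chance base.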


Possible scores range from −23 to +24; the actual maximum score achieved was 21. 


Results 


A 2 × 2 × 4 (culture × sex × educational level) analysis of variance was carried out. Main 
effects for culture (F = 42.22, d.f. = 1, 144) and sex (F = 16.97, d.f. = 1, 144) were both 
highly significant (P < 0.001); but educational level was not significant, nor were any of the 
interactions. Mean scores are shown in Table 1, where the higher performance of Scots of 
both sexes is evident, though unsurprising. The salient feature from the present standpoint 
is the fact that the magnitude of the sex differences is the same in both cultures, as 
indicated by the absence of an interaction in the ANOVA. Hence the findings clearly fail to 
bear out the prediction that the sex difference in Ghana should at least be smaller than the 
Scottish one. 


Table 1. Mean scores by culture and sex 

            Scotland            Ghana 
          Boys    Girls     Boys    Girls 
Mean     12.13     8.73     7.05     4.65 
SD        4.64     4.44     3.99     4.55 
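
As a rough check on the pattern just described, the F ratios can be approximated from the summary figures in Table 1 alone. The sketch below is not the analysis reported above: it assumes 40 children per culture-by-sex cell (10 per class over four classes), ignores educational level, and pools the four cell variances as the error term, so its values can only approximate the published ones. The names `CELLS`, `N_PER_CELL` and `approximate_f_ratios` are hypothetical, and the SD for Ghanaian girls is read as 4.55.

```python
# Cell means and SDs transcribed from Table 1.
# Assumption: the Ghanaian girls' SD is 4.55 (the printed value is ambiguous).
CELLS = {
    ("Scotland", "boys"): (12.13, 4.64),
    ("Scotland", "girls"): (8.73, 4.44),
    ("Ghana", "boys"): (7.05, 3.99),
    ("Ghana", "girls"): (4.65, 4.55),
}
N_PER_CELL = 40  # assumption: 10 children x 4 classes per culture-by-sex cell


def approximate_f_ratios(cells, n):
    """Approximate 1-d.f. F ratios for a balanced 2 x 2 layout from summaries."""
    # With equal cell sizes the pooled error variance is the mean of the variances.
    ms_error = sum(sd ** 2 for _, sd in cells.values()) / len(cells)
    sb, sg = cells[("Scotland", "boys")][0], cells[("Scotland", "girls")][0]
    gb, gg = cells[("Ghana", "boys")][0], cells[("Ghana", "girls")][0]
    contrasts = {
        "culture": (sb + sg) - (gb + gg),
        "sex": (sb + gb) - (sg + gg),
        "culture x sex": (sb - sg) - (gb - gg),
    }
    # A contrast with coefficients +1/-1 over four cells of size n has SS = n * L^2 / 4,
    # and with 1 d.f. its mean square equals its sum of squares.
    return {name: (n * L ** 2 / 4) / ms_error for name, L in contrasts.items()}


if __name__ == "__main__":
    for effect, f in approximate_f_ratios(CELLS, N_PER_CELL).items():
        print(f"{effect}: F is approximately {f:.2f}")
```

Under these assumptions the approximate F ratios are about 43.1 for culture and 17.3 for sex, close to the reported 42.22 and 16.97, while the interaction ratio is about 0.5. The small discrepancies are to be expected, since the sketch uses a slightly different error term from the full three-way analysis.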

Discussion 

The outcome of the study has a bearing on several hypotheses concerning sex and ethnic 
differences. First, there is the previously mentioned genetic hypothesis of Jensen. He made 
a prediction, couched in terms of American black-white comparisons, that ‘the size of the 
sex difference should be a monotonic function of...the proportion of Caucasian 
admixture’ (1975, p. 160). It will be evident that this prediction, which assumes not merely 
maximum spatial deficit but also minimal sex differences in ‘pure’ blacks, is falsified by the 
present data. 

The results are also inconsistent with Sherman’s (1967) ‘bent twig’ hypothesis, indicating 
that social pressures towards sex-specific toy usage cannot be a very salient factor 
accounting for sex differences on spatial-perceptual tasks. This view is supported by the 
findings of Hutt (1970) showing that even pre-school children displayed distinct preferences 
according to sex for novel toys that were not sex-typed. 

It is less easy to assess the implications for Serpell’s (1979) position, which requires more 
detailed consideration. In his exposition Serpell sometimes referred explicitly to 
cross-cultural differences in pattern reproduction, yet in other passages appeared to imply 
that his formulation extended to every kind of spatial-perceptual task. If the interpretation 
was intended to be confined to pattern reproduction, which seems unlikely, then the 
present findings dealing with mental rotation would simply be irrelevant. However, it is 
possible to question whether, on his own evidence, Serpell has really established his thesis. 
He argued that his cross-over design (whereby English children were better at drawing 
tasks, whereas the Zambians proved superior in wire-modelling) compels the conclusion 
that differential acquisition of highly specific perceptual skills must be exclusively 
responsible. The logic of such an inference is questionable, in so far as it seems to exclude 
the operation of any dispositional factors. Consider the facts as reported: one task 
(drawing) was reasonably familiar for both cultural groups, the other task (wire-modelling) 
was totally unfamiliar for English children but well-practised by Zambian boys. It may be 
mentioned in passing that, given this lack of familiarity, the English children produced a 
fairly creditable performance. The point to stress, however, is that a dispositional view does 
not necessarily entail a superior performance on the first encounter with a task. Apart from 
this objection in principle, there is considerable internal evidence that the ‘specific skill’ 
hypothesis did not readily account for several aspects of the findings, so that Serpell even 
had to have recourse to subsidiary motivational interpretations. 

All this is not to deny the importance of specific perceptual skills, but suggests that it is 
not justifiable to dismiss dispositional factors, including genetic and neurological. On the 
other hand Serpell’s rejection of broad and general cognitive constructs as explanations of 
differences in spatial-perceptual performances is entirely congruent with the present writer's 
critique of the notion of a single generalized ‘spatial ability’. There is mounting evidence of 
the existence of a variety of distinct spatial-perceptual abilities and skills, though most of 
the work of isolating these and especially exploring their several determinants remains to be 
done. 


Book reviews 


The Structure and Measurement of Intelligence. By H. J. Eysenck. Berlin: Springer Verlag. 1979. Pp. 
253. $24.20. 


The time is ripe for a good book on intelligence (beyond the fact that a good book on anything is 
always apposite). Modern methods of factor rotation and the resolution of other technical niceties in 
factor analysis have enabled a consensus of results in the field of abilities to be obtained, results 
which seem to be misunderstood or unknown to many psychologists. In addition, recent work in 
biometric genetic methods and controversies over the inheritance of intelligence, the general 
egalitarian Zeitgeist and the flight from numeracy in psychology (labelled Humanism) all make it 
necessary to summarize accurately and clearly what is now known about intelligence. 

The first good point to note about this book is that all these critical issues are discussed, indeed 
most have separate chapters so that there can be no doubt that Eysenck has treated the right topics. 

The chapters on heredity and environment are excellent. This is hardly surprising since they are the 
part contribution of David Fulker who is a leading light in the field of biometric genetics and who 
brilliantly exposed the fallacies of Kamin’s approach to this problem. It is a pity, although it does not 
invalidate the chapters, that they were written before the final proof that Burt had cheated. However, 
the explanation of the biometric method and its utilization on the analysis of data to demonstrate the 
high heritability of IQ scores is clear and one that most undergraduates should be able to follow. Nor 
do these writers ignore the influence of environmental factors on such scores. On the contrary they 
greatly demonstrate how the findings that good home background and social advantages can create 
IQ differences of up to 15 points fit their model. 

The discussion of models of the structure of intellect is perhaps a little disappointing. Although 
Guilford's work is rightly criticized in the light of the factorial findings that Procrustes rotation 
methods can verify anything (i.e. can verify nothing) there is no mention of Cattell’s work (the 
ADAC model) and this seems somewhat partial. If unacceptable, it should at least be shown to be 

A final chapter is devoted to the social implications of the argument that there is a high heritability 
component for general intelligence, although there is little discussion of racial differences in 
intelligence. 

All the topics are clearly explicated with reference to recent research findings. Any undergraduate 
who knew and understood this book, could be said to have a good grasp of the subject. Indeed this 
reviewer will recommend it strongly to his students. However, it must be also said that 1979 has seen 
the publication of another truly excellent book — Intelligence, Heredity and Environment by Philip 
Vernon, a book which perhaps for teaching purposes is even better than that of Eysenck. No reader 
should choose one without first reading the other. 

In brief, Eysenck’s book is a clear up-to-date account of the main topics in the field of intelligence. 
It should certainly be read by any who would oppose the use of intelligence tests. Let them answer it 
if they can. 

PAUL KLINE 




VERNON, P. E. (1979) Intelligence, Heredity and 
Environment. San Francisco, Calif.: W. H. Freeman. 


Intelligent Testing with the WISC-R. By Alan S. Kaufman. New York: Wiley. 1979. Pp. xix + 268. 
£12.95. 


In this book Kaufman sets out to help those using the WISC-R scales (R stands for revised) 
to make the best use of their data. Although Kaufman attempts to give his interpretative methods 
theoretical support from research findings, basically this is a book for practitioners by a practitioner. 
It distils the essence of years of experience. 

This practical approach has its deficiencies. For example, he claims that WISC-R results fit 
Cattell’s distinction of crystallized and fluid ability, coordinate with Piaget’s experimental tasks, are 
classifiable in terms of right brain or left brain, and can be defined in terms of Guilford’s structure of 
intelligence factors. Unfortunately this last can hardly be to the advantage of the WISC-R scales since 
it has been shown that Guilford factor-structure is an artifact of the Procrustes rotations used to 
verify it. 

On the other hand, in terms of practice there can be no doubt that what Kaufman advocates is 
eminently sensible. Essentially he argues that the tester should break away from the bounds of the 
actual scores and use his intelligence (hence the title) to gain the maximum possible information 
about the intellectual functioning of the child. The tester indeed is to act as a cognitive detective. 

The basis of the detection which most of the book is concerned with rests on three assumptions 
about the WISC-R scales, from which indeed all the rest logically arises: 

(a) that the WISC-R measures what an individual has learned. This should be compared with the 
claims of, for example, the Cattell Culture Fair test which it is hoped measures fluid ability; 

(b) that the WISC-R scales are simply a sample of behaviour; 

(c) that the scales assess mental functioning under fixed experimental conditions, this last countering 
the well-known clinical claim that children often fail to answer what they know in the WISC. 

A valuable feature of the book is the case studies, where individual case reports are discussed in 
considerable detail. In these readers are shown how WISC-R scale scores are interpreted in the light 
of other information and test data. These well illustrate the imaginative reasoning advocated by the 
author. 

There is a long section on the interpretation of subtest differences including a step-by-step guide. 
This, of course, flies in the face of the obvious difficulty caused by the relatively poor reliability of the 
subtests which Kaufman admits. His solution to the problem is probably typical of the skilled 
practitioner, ‘...I have a special fondness deep down for the theoretical soundness of ardent 
empiricists...’ (who claim that subtest differences cannot be used) but to follow those precepts would 
mean the abandonment of one's clinical skills that permit going beyond the verbal IQ. Thus in the 
end Kaufman believes clinical insights in the individual case are more powerful than statistical 
probabilities. 

The book is clearly written with much illustrative material. For the practical psychologist using the 
WISC-R scales it would be folly not to read this book. Indeed it is a useful guide for individual 
clinical assessment, a must for courses in educational and clinical psychology. 

PAUL KLINE 


Experience and the Growth of Understanding. By D. W. Hamlyn. London: Routledge & Kegan Paul. 
Pp. 159. £6.50. 


This book is one of the International Library of the Philosophy of Education edited by R. S. Peters 
and is directed at a readership of educational philosophers and their students, rather than 
psychologists. However it will be of interest to educationalists generally and to developmental 
psychologists, especially Piagetians, whether they are for him or against. Professor Hamlyn is already 
established as an authority in philosophizing about Piaget. In this book he spreads his net wider, in 
particular contrasting the approaches that Chomsky and Skinner did and might take to the 
development of knowledge. In the end one has the impression that Piaget remains his favourite 
theorizer because his theory is both genetic and structural. In the early chapters Hamlyn expounds 
and annotates the empiricist notion that knowledge is entirely acquired through experience (there is 
an interesting section on what Aristotle really meant by induction) and discusses the problem of the 
infinite regress — that knowledge must build on previous knowledge, and contrasts this with the 
nativist ideas of Gestalt psychologists and Chomsky who are seen as implying that knowledge can in 
some sense be innate. 

As one might expect from a rational structuralist, Hamlyn takes a middle road which is a bit like 
Piaget’s. He believes in particular that Piaget has not dealt adequately with the objectiveness of 
knowledge and with the impression one has that knowledge is shared. (It is interesting to speculate 
what the shared knowledge would be between a member of a rediscovered tribe in New Guinea and 
Professor Hamlyn.) In proposing his own middle road, he thinks that the ability to distinguish 
between self and non-self and to develop a concept of a person are essential. For knowledge to grow, 
trial-and-error is not enough. The child ‘has to be put in the way of things by adults; his attention is 
thus directed and focused, in such a way that questions of relevance would not need to arise even if 
they could’. This is one of the reasons that lead Hamlyn to put emphasis on the relations between 
child and adult, both for learning and teaching. 

This reviewer, being untrained philosophically, cannot do justice to many of the arguments, nor 
does he find some of the writing easy (e.g. ‘It appears that when a child is born it knows nothing and 
understands nothing – and I speak in a qualified way, speaking of what appears to be the case, only 
so far as not to beg questions which will arise later’). From a psychologist’s point of view, there are 
many issues one would want to take up. For instance, when the author discusses the way a child 
comes to discriminate, the child is treated as almost passive. In practice the child interacts with its 
environment and the development of discrimination depends largely on the outcome of those 
interactions; and since the child's actions will be affected by its motivations, those too will affect the 
development of discriminations. Perhaps more important is the way in which Hamlyn deals with the 
word ‘knowledge’. Of course he makes Ryle's distinction between ‘knowing how’ and ‘knowing 
that’, but a psychologist might want to make many more subdivisions, for the crucial reason that 
different kinds of knowing involve different psychological functions: knowing (recognizing) a person; 
knowing how to carry out a skill; knowing techniques for solving problems; being able to put a 
principle into words; understanding an explanation; knowing names for things; knowing one's way 
about a city; having a concept of conservation. 

Hamlyn uses the word learning in a restricted sense. (Pavlovian conditioning is excluded; and one 
may come to know something in the absence of learning.) However, he is worried by the word and 
says, ‘If the term “learning” were to cause difficulty I should be happy to drop it’. Could he be 
persuaded to drop the term ‘knowledge’ instead? 

B. M. FOSS 


Pavlov. By J. A. Gray. London: Fontana Modern Masters. 1979. Pp. 140. £1.25. 


The Fontana Modern Masters series has been widely acclaimed, and the addition of a book on 
Pavlov, a physiologist whose work is usually part of core lectures to undergraduate 
psychologists, is very welcome. There can have been few cases where the selection of the student to 
describe the master can have been more apt than in the choice of Jeffrey Gray to give the account of 
Pavlov, since Gray is active in several experimental areas much influenced by Pavlov, and is also 
familiar with modern Russian writings in these fields. The outcome is well up to the standards of the 
series; a volume which is succinct, and accessible to the layman or undergraduate, which nevertheless 
incorporates a number of judgements and comments which will engage and intrigue professionals. 
Anyone remotely interested in Pavlov should read it. 

Gray begins by deftly sketching in the biological and philosophical background to Pavlov's 
psychological associationism. It is easy to forget that within Pavlov's working life the concept of 
individual nerve cells (neurons) within the nervous system was hotly contested, and that Pavlov is not 
responsible for the idea that conditioning experiences change synapses. There are interesting 
philosophical precursors to Pavlovian attitudes in the writings of the Scotsmen Hume and Hartley. 
(Although Pavlov gave more direct acknowledgement to Descartes' concept of reflex action, Spencer's 
treatment of instincts, and studies of cortical function by, among others, Ferrier, Goltz and Munk.) 
Pavlov's general theories of brain function followed after more detailed work on neural control over 
‘the blood and the gut’ had gained him the Nobel prize. His fame rested initially on his skill as an 
experimental surgeon: where others had failed, he was able to externalize a portion of the stomach 
(Pavlov's pouch), from which secretions of the pancreas could be collected. The influence of the taste 
and smell of food on stomach activities was measured by ‘sham feeding’: pure gastric juice, 
uncontaminated by food itself, was obtained from a gastric fistula when an additional operation 
prevented food from reaching the stomach. Gray makes an exciting narrative out of the interplay of 
theories of neural and hormonal control of digestion secretions, which Pavlov had to integrate in his 
Nobel lecture of 1904. (My only quarrel is with the implication that gastric secretions were collected 
in man long after Pavlov's experiments with dogs: in fact Pavlov was explicitly trying to replicate in 
dogs observations published in the 1830s by Beaumont, made on a Canadian trapper with an 
unhealed gunshot wound in the stomach wall.) 

The two introductory chapters are followed by a fairly standard ‘Hilgard and Marquis’ description 
of the methods and results of experiments on salivary conditioned reflexes, fleshed out with the 
Garcia ‘bait-shyness’ phenomenon. Then under ‘The Theory of Conditioning’ Gray firmly puts the 
post-Pavlovian case for separate processes of classical and instrumental conditioning. The Pavlovian 
sort is now airily defined as something which ‘enables the animal to learn the relationships between 
stimulus events in its world’ which amounts to the animal ‘knowing that’ (p. 64), while instrumental 
conditioning results in ‘knowing how’ to achieve ends. Pavlov himself was very complimentary about 
Thorndike's knowing-how-to-get-out-of-the-puzzle-box experiment; and noticed that his own dogs 
would learn to shake the food delivery apparatus to get food out for themselves, if given the 
chance, but said everything was a matter of forming associations in the hemispheres, and left it at 
that. Another post-Pavlov development is the assessment of the sensitivity of animals to probabilistic 
relations between stimuli, and Gray concludes that Pavlov may have over-emphasized temporal 
contiguity, although Pavlov rarely mentioned contiguity (it was Guthrie who did) but liked to talk 
about ‘signalling functions’ of stimuli. 

Pavlov’s theory of the brain mechanisms which underlie conditioning and signification involved two 
forms of brain activity, ‘excitation’ and ‘inhibition’, which ebb and flow about the cortex 
(‘concentration’ and ‘irradiation’) in a rather Gestalt-like manner. No one seems to believe in the 
ebbing and flowing any more, but ‘inhibition’, in an even vaguer substantiation, is still popular as an 
explanation for why an animal does not do something. Gray suggests that theories of brain function 
in conditioning have made little progress since Pavlov. 

Like Piaget with children, and Skinner and Lorenz with other species, Pavlov established a 
considerable reputation as a recorder of empirical phenomena without ever resorting to a textbook of 
statistics. One consequence of an emphasis on individual animals was the Pavlovian theory of 
personality. Dogs selected as friendly and vivacious were very poor at conditioning, as they reacted to 
the tedium and constraint of the experiment by either biting everything in sight or going to sleep. On 
the other hand, withdrawn and cowardly animals were excellent performers in the conditioning stand, 
provided that they did not get too nervous and go off their food. Pavlov adapted the Greek theory of 
temperaments to account for this, and, for better or worse, became responsible for the Eysenck 
Personality Inventory. Gray modestly omits to mention his own contributions to personality theory, 
but rightly suggests that Pavlov's influence is still visible, not only in the EPI, but in some forms of 
behaviour therapy as well. 

The influence of Pavlov on theories of animal learning has of course been profound. But it can be 
argued that experimental advances in classical conditioning since Pavlov have been inconsequential, 
and that interpretations of Pavlov in the English-speaking world have been misguided. Pavlov (1930) 
himself wrote a vigorous attack on the way his work was then being treated by Guthrie and Lashley, 
but to no avail. As Gray points out, Pavlov would be just as ‘surprised and distressed’ today by the 
lack of interest in the brain shown by Western psychologists who study conditioning. Although 
Pavlov quite naturally pushed his own methodology of salivary conditioning, he constantly reiterated 
that what he was really after was an understanding of the ‘analysing and synthesizing’ activities of 
the cerebral cortex. Gray might have achieved a better balance between the real Pavlov and the 
image seen in what the cover blurb calls the ‘distorting mirror of Behaviourism’ had he mentioned 
Pavlov's own attempts to manipulate cortical activity directly, by the lesion method. These attempts 
would probably have been more extensive but for what Pavlov called his ‘big mistake’ (Pavlov, 1927, 
p. 321) of trying to make cortical lesions without any loss of blood, by a technique which led to 
post-operative scar tissue. Despite these difficulties, Pavlov produced results which would give him 
something to talk about in modern laboratories of physiological psychology. For instance, he 
investigated what Munk had called ‘psychic’ blindness, in dogs with ablations of the visual cortex: 
they had no normal object vision, but could be trained to give salivary reflexes to different levels of 
luminosity, and eventually to distinguish between a luminous cross and circle (Pavlov, 1927, p. 343ff). 
The conclusion drawn was that ‘the dog understands but does not see sufficiently well’. Extensive 
work was done on the nature of auditory discrimination after temporal lobe ablations, and there were 
a number of very modern-sounding experiments on tactile discrimination in hemi-decorticate and 
split-brain dogs. The conclusion in the latter case was that there is no homolateral connection of the 
skin with the cortex, but with the commissures intact there are strong point-to-point connections 
between tactile sensations on different sides of the body, in dogs. 

Not satisfied with the lesion method, Pavlov started work with multiple implanted electrodes. The 
point is that it might be fairer to Pavlov to emphasize his role as a brain scientist, coming between 
Ferrier and Munk, and for instance, Luria and Sokolov, rather than to limit his influence to the 
invidious position of being progenitor of the theories of Guthrie and Hull. Much of what Pavlov said 
about ‘analysing and synthesizing’ in the ‘functional mosaic’ of the cortex makes a certain amount of 
sense if each element of the mosaic is a modern cortical column, and even ‘irradiation and 
concentration’ of neural activity may not have been too far off the mark, in the light of what is now 
known about the topographic organization of sensory cortex. 

A final area in which Gray’s account provokes a number of questions is the network of 
relationships between Pavlov, his theories, and the Soviet state. Gray notes the decree signed by 
Lenin in 1921 which arranged for optimum conditions to be established for Pavlov's experimental 
work, and for Pavlov and his wife to receive double food rations. The same decree stipulated that 
government presses should print Pavlov's output in de luxe editions and that his apartment should be 
properly furnished. Gray states that Pavlov was hostile to Lenin’s regime, and that Party officials 
rejected Pavlov’s theories as ‘vulgar materialism’. If this is so then the treatment of bourgeois liberals 
promulgating incorrect theories in the Russia of Lenin and Stalin must have been more benign than 
one has been led to believe. An alternative view is that there was a degree of mutual enthusiasm 
between Pavlov and the Russian government of the 1920s and 1930s which is now rather an 
embarrassment. Pavlov began in April 1917 (after the abdication of the Tzar, and just before the 
arrival of Lenin) by publicly welcoming the end of a ‘sombre epoch of oppression’ (Pavlov, 1955, p. 
49). However, in the same speech (read in his absence to a conference of physiologists) he pointedly 
remarked that ‘A grievous sin was committed by the Great French Revolution when it executed 
Lavoisier’. He obviously had some worries. But his treatment under Lenin could hardly have been 
more favourable, and even this was bettered under Stalin, when a special ‘scientific city’, Koltushi, 
was built outside Leningrad, which included facilities for research on apes. In return, Pavlov, at the 
very least, stayed in Russia, rather than emigrating elsewhere. (It has been said that Pavlov was 
offered a life-time grant in the 1920s, by the Medical Research Council, if he would move to England. 
As a full-time researcher under two Tzars, Lenin, Stalin, and with this offer as well, Pavlov was 
certainly outstanding in his ability to attract research funding.) More probably, Pavlov actively 
supported what he felt to be useful policies in his native country. His speech to a reception at the 
Moscow international conference in 1935 expressed passionate support for the ‘historic social 
experiment’ of Russian government at the time. He visited collective farms and the last publication 
recorded in the official selected works (Pavlov, 1955) is a laudatory letter from Academician Pavlov 
at Koltushi to ‘leading miners’ in the Ukraine. Was Pavlov one of those prominent scientists who 
served on subcommittees of the Politburo in the 1930s and gave opinions on the suitability of 
propaganda for collectivization of the farms and worker emulation in the mines and factories? 
Perhaps not. But if Pavlov and his theories had nothing to do with social policies and propaganda 
techniques in Russia, then it is doubly ironic that the application of his ideas was more visible in 
America. In 1921, the year of Lenin’s decree, J. B. Watson was signing on with J. Walter Thompson 
in New York, and before he left in 1935, Watson had developed, among other things, the concept of 
brand loyalty, and the crudely Pavlovian apposition of the brand name with stimuli indicating high 
social status, which effectively influences consumer behaviour to this day. (One Watsonian variant 
was testimonial advertising and one of the products he worked with was coffee.) But it was Pavlov, 
not Watson, who said that ‘suggestion is man's most simplified and most typical conditioned 
reflex’. 

Gray’s account of Pavlov thus raises more questions than it answers, and emphasizes the purely 
behaviourist and Hullian aspects of his influence. However, these can hardly be called failings in a 
book of this length and purpose and Gray should be congratulated, and the book recommended to 
undergraduates. It could be recommended even more warmly, if only it had an index, but there seems 
to have been a decision not to have indexes in this series. 

STEPHEN WALKER 


PAVLOV, I. P. (1927). Conditioned Reflexes. An 
PAVLOV, I. P. (1930). The reply of a physiologist to 


Perception: The World Transformed. By Lloyd Kaufman. New York: Oxford University Press. 1979. 
Pp. 416. £10.75. 


Kaufman's earlier book Sight and Mind was, for my money, the best book on visual perception of the 
past 10 years. The careful, analytical treatment of several important areas in perception contained 
some of the best expositions of difficult topics I have ever read. Sight and Mind is demanding: 
students find it difficult but absorbing. Now, in Perception: The World Transformed, Kaufman has 
written a much more accessible introduction to some of the key problems of perception. It is a very 
good book. 

The early chapters of Perception deal with the nature of light and its detection, brightness and 
lightness, colour vision, spatial frequency analysis and bar and edge detection. The standard of 
exposition in these chapters is so high that many of us will need to rewrite our lectures. 

Kaufman realizes that to explain perceptual phenomena in terms of analytical process in the visual 
system leaves unanswered the question as to how we achieve a coherent internal representation of the 
external world. Later chapters deal therefore with the problems of form and meaning, the nature of 
visual space and how we gauge distance and size. Whenever possible, Kaufman attempts to place 
phenomena into a theoretical framework: he is concerned always with possible explanations of 
perceptual effects and how we can try to make sense of them. His own predilections are for 
explanations based on physiological mechanisms, but he wears his theories lightly and is careful to 
distinguish between that which is established and that which is still speculative. 

My only criticism of this book concerns the treatment of the non-visual modalities. After several 
full and detailed chapters on vision, the very short and elementary discussions of other sensory 
modalities are disappointing. Kaufman includes these topics because he wishes to conclude with a 
discussion of a perceptual system, taking as his starting-point Gibson's arguments concerning the 
integrated nature of the various senses. But the sections on hearing, touch, smell and taste are not 
very satisfactory. They are the wrong length. It would have been better to refer the reader to any of 
the standard introductions to these topics, or to have attempted a much fuller discussion. 

Generally, however, I welcome this book. One takes from it a strong impression of an extremely 
well-informed scholar who is also, clearly, a gifted and enthusiastic teacher. The book will be an 
invaluable text in undergraduate courses on perception for some time to come. 

I. E. GORDON 


Foundations of Contemporary Psychology. Edited by Merle E. Meyer. London: Oxford University 
Press. 1979. Pp. x + 726. £12.00. 


This text comprises 20 chapters, contributed by various authors and aims at a comprehensive and 
contemporary introduction to psychology. A problem facing any such attempt is the need to retain 
coherence and yet to convey the diversity of topic and approach. One solution is to forgo any 
integrated account in the face of the current state of the subject-matter and to emphasize the positive 
side of controversy and variety as forces encouraging development. The editor adopts this solution 
and the book as a whole makes enjoyable reading. Each chapter is viewed as complete in itself and 
no order of reading is recommended. 

The text opens in traditional manner with a glimpse of various methodological and philosophical 
issues and raises the question of whether psychology is to be considered pre- or multi-paradigmatic. 
The following chapter, after surveying the development of psychology in terms of the attitudes and 
prescriptions of its practitioners, concludes that although psychology lacks a universally agreed 
paradigm it follows the methodological prescriptions of science in general. Measurement is one of 
these prescriptions and it is treated clearly and concisely as a topic, together with the basic methods 
of data analysis and psychophysical techniques, in the next chapter. 

The succeeding two chapters consider the neural and sensory aspects of psychological functioning. 
The former includes an overview of neural transmission and brain anatomy whereas the latter gives a 
detailed and well-illustrated account of the neurophysiology of the senses. The ‘hardware’ aspects of 
psychology receive further treatment in a chapter on behavioural genetics and also in a chapter on 
comparative animal behaviour written within the ethological framework of Tinbergen. This latter 
chapter concludes with a comparative study of copulatory behaviour in mammals, wrily entitled ‘An 
attempt at synthesis’. The behaviour of selected two- and four-legged creatures undergoing classical 
and instrumental conditioning is not neglected either and is well covered following a chapter on 
motivational influences on behaviour. 

Perception, memory and cognition in humans as well as the development of these functions are all 
treated in separate chapters with the chapter on perception achieving a mastery of exposition. An 
excursion into the world of sleep continues the reports on aspects of human consciousness. From this 
point on the book is centrally concerned with issues in social psychology, cross-cultural psychology 
and individual differences. 

Various aspects of social behaviour and the application of psychological research to social 
problems such as conflict and prejudice form the content of two chapters. A subsequent chapter 
provides a cross-cultural perspective on social performance and perception and stresses the value of 
such research, both in the creation of psychological laws and in ameliorating international social 
problems stemming from different views of the social world. Individual differences within a culture 
receive treatment in a chapter on psychological assessment. This includes an extensive section on the 
recent IQ controversy. The final two chapters consider the various attempts to conceptualize 
personality and to explain and treat its pathological aspects. 

Overall, therefore, this text reflects topics usually dealt with in introductory courses. It strikes a 
reasonable balance between chapters on the physical bases of behaviour and those concerned with 
purely behavioural or experiential descriptions. It excludes certain topics such as human non-verbal 
communication and motor skills as well as devoting little space to experimental cognitive psychology 
and the import of work in artificial intelligence. 

The chapters conform to a common format (viz. introduction, report, summary, glossary and 
suggested readings) but vary in style and detail of presentation. Such variety has merits but some 
styles seem more suitable than others. One function of an introductory text is perhaps to encourage a 
student to become involved in an area by stimulating questions. Some chapters achieve this goal by 
indicating zones of uncertainty and points for further research. (The chapter on perception is a good 
example of this strategy.) Other chapters seem less conducive to such involvement. In part, this 
difference in style reflects whether a chapter has been written primarily to orient the reader in a 
research area of wide scope (such as the chapter on developmental psychology) or whether it aims at 
a detailed presentation of a more defined area (such as the chapter on the neurophysiology of the 
senses). The latter tend to be more open textured perhaps because they can afford to be. 

There are some good attempts to handle the variety of approaches in an area. In the chapter on 
cognition, for instance, a simple problem is considered from associative, Gestalt and information- 
processing points of view. In other cases, such as the account of various personality theories, the 
alternatives are treated as alternative models of personality that have evolved partly to explain similar 
phenomena and partly to account for different phenomena. Of course, whether or not an 
introductory text should rest at the point of displaying variety or should go on to propose certain 
lines of integration is open to question. Personally, I would have welcomed some such attempt. For 
example, an information-processing approach might be held to encompass the salient features of both 
the associative and Gestalt points of view since it eens to nae the organization of component 
processes. 

One of the merits of this text is that it raises the issue of the social responsibility of psychologists at 
various points. It tends however to assume that the application of psychological research to social 
and personal problems invariably leads to an increase in well-being. This possibility and its 
alternative might perhaps have been considered explicitly during the introductory chapter as part of 
the section on psychology and its practitioners. 

A new text is presumably intended to undergo revision. Foundations of Contemporary Psychology 
will need to be revised if it is to meet its goal of fully reflecting current psychology. Whether such a 
revision will occur depends in part on its success in the face of Introduction to Psychology, 7th ed. 
(Hilgard et al., 1979). As it stands it lacks an associated study guide and may therefore begin its 
career in this country as a library rather than a student acquisition. 

DAVID GREEN 


HILGARD, E. R., ATKINSON, R. L. & ATKINSON, R. C. 
(1979). Introduction to Psychology, seventh ed. New 
York & London: Harcourt Brace Jovanovich. 


J. B. Watson: The Founder of Behaviourism. By David Cohen. London: Routledge & Kegan Paul. 
1979. Pp. 297. £8.95. 


Watson is a worthy subject for a full-length book and David Cohen admirably fills the role of 
biographer with this readable account of a colourful life, important contributions to psychological 
research and the creation of a new ‘ism’ that changed the direction of psychology and deeply 
influenced everyday thinking. For more than half a century Watson's achievements have been 
neglected and his views persistently misrepresented, ironically, for Watson himself invented the style 
of advertising that creates a good ‘image’. This book should help to give Watson the place he 
deserves in the history of psychology. 

Why has Watson lacked a proper biography until now? Reasons are not difficult to find. Famous 
men often try to help their biographers, but Watson did not, and his curious autobiography 
published in 1936 is daunting because it hints at so much that would be of interest if only the details 
could be obtained. Cohen has succeeded in marshalling enough material from scattered sources to 
put some flesh on Watson. A complicated character emerges, attractive and irritating, impulsive and 
persistent, belligerent and conciliatory by turns, and as enigmatic as the Gellert cartoon appearing on 
the jacket of the book (regrettably its only illustration). There are other possible reasons for neglect. 
Watson lived on for 38 years after his academic career came to a sensational end in 1920, and for a 
time he turned to public lecturing and to writing dogmatic and sometimes sensational articles for 
popular magazines, so alienating himself from academic psychologists. After his divorce from 
psychology in 1920 and while making a new career in advertising he continued to supervise the 
well-known research on infants by Mary Cover Jones, but he left no disciples to carry on his work 
and indeed it is doubtful whether he ever had any, except perhaps the young Lashley. The 
neo-behaviourists of the 1930s and 1940s disregarded his research and continuity was broken, so that 
behaviour therapy, behavioural studies of infants, and comparative field studies had to be 
rediscovered later. Behaviourism was so closely associated with Watson, who had become notorious, 
that his successors in spirit tended to avoid applying the label to themselves. Even the recognition 
conferred by the belated award of the Gold Medal of the APF did little to restore his reputation and 
by now the majority of psychologists know only the stereotype of Watson, the stupid bigot, which 
appears in so many unscholarly textbooks. 

Cohen's approach is open and sympathetic. He describes more than he evaluates, remaining 
unobtrusive and leaving Watson on the stage speaking for himself. The book may do more to destroy 
the stereotype because of that, and particularly because it does not generally gloss over Watson's 
faults and deficiencies, although it leaves out some of the worst excesses of his later writings, e.g. the 
description of a Utopia in which children would change ‘parents’ 13 times each year. Just listing 
what Watson did makes the reader aware of the breadth of his interests, of his flair for research, and 
perhaps most surprisingly of the range of methods that he was prepared to use, including surveys by 
questionnaire as well as conditioning, and field observation of an ethological type. As Cohen gently 
insists ‘His behaviourism was not the rather constricted thing the word is now taken to mean’. 

The book has its shortcomings of course. For the historian of psychology it does not adequately set 
Watson's work into its context. Contemporary behaviourists are scarcely mentioned, and in the 
discussion of particular theories the reader would have been helped by a little more information on 
the problems of psychology as they were seen at the time. For example, the distinctive feature of his 
theory of thinking was the answer given to the question then current, of how thinking is carried on, 
whether by images, kinaesthetic or otherwise, or by impalpable contents or by inner speech or 
muscular contractions. Apart from his answer to this question, his theory is not very different from 
the others and it shares its inadequacies with them. There are some surprising lapses too. Now and 
again Cohen returns to a topic to make a point or a criticism that has been dealt with at an earlier 
stage. Occasionally there are unexpectedly naive comments, as for example when he says that a 
self-respecting behaviourist would have been much more predictable in his actions than Watson. And 
it is disappointing to catch glimpses of a cardboard B. F. Skinner akin to the cardboard Watson so 
convincingly destroyed by Cohen. There are other minor blemishes: the index is sketchy, the 
bibliography is incomplete and there are some inaccuracies in the text. The historian's device of 
prescience (little did they know that the Hundred Years War had begun) occurs with irritating 
frequency. And can it be right to anglicize the very titles of Watson's books? 

But the strengths of the book so outweigh its weaknesses that it would be wrong to end with 
anything other than an enthusiastic recommendation. This is a very successful attempt to present a 
fresh picture of a man and his work and it makes good reading. Even Watson, who disliked 
biographies, might have been able to see merit in it. 

R. L. REID 


Structural/Process Models of Complex Human Behaviour. Edited by J. M. Scandura & C. J. Brainerd. 
The Netherlands: Sijthoff & Noordhoff. 1978. 


This book, based on a NATO-supported symposium held in 1977, contains articles by 12 authors. 
Most of the contributions are followed by a commentary, written by Feibel, McWhinney, Reed, 
Reulecke, K. Wilson, or another contributor. Usually, the commentaries are instructive and 
balanced. The majority of the contributors discuss complex problem-solving. Flavell goes one step 
further, discussing the acquisition and use of knowledge about problem-solving and other cognitive 
phenomena. There is an emphasis on learning, development, individual differences (Hunt, Royce, 
Kearsley & Klare) and often some reference to practical concerns (e.g. Larkin). In general, common 
concerns are expressed throughout the book. However, the particular topics, the approaches and the 
basic theoretical assumptions often differ considerably; this I found useful. Landauer, for example, 
effectively questions a basic assumption held by many of the authors: that memory is organized in 
the sense that a relatively sophisticated set of procedures determines the economical storage of 
information and its subsequent access. 

Frequent reference is made to the use of rules in solving problems. Most of the contributors believe 
that in learning to solve problems we learn to apply rules. Landa, among others, discusses the 
problem, both for psychological theory and educational practice, that students often know the rules 
but are unable to apply them. Brainerd, in an examination of learning discrimination shifts, rejects 
explanations of reversal shifts that are based on the application of rules such as ‘do the opposite’. It 
seems to me that in the volume as a whole there could have been more discussion of the differences, if 
any, between asserting that people use rules, that their behaviour is governed by rules, and that their 
behaviour exhibits regularities. Only Kempf, in a conceptual analysis of the term rule, appears to 
address himself to some of the issues involved here. Paivio, in reviewing Dual Coding Theory, makes 
a claim for analogical representation, supporting his arguments as usual by a mass of data it is 
impossible to ignore. But, like everyone else who writes on this topic, Paivio seems unable to specify 
clearly the fundamental properties of the analogue format. In reading this book, I ask the question: Is 
problem-solving in an analogue format compatible with the use of rules? I do not know and I do not 
find the answer provided. 

Some of the contributions are highly technical (e.g. Pask, Scandura, Klahr) using terms from 
computer programming, cybernetics, or other formal symbols. Sometimes one feels, as one of the 
uninitiated, that sound ideas are being proposed, but that the technical terms merge into jargon. 
This seems, in particular, to be true of Scandura's contribution on Structural Learning Theory (about 
one-quarter of the book). Scandura begins with a laudable commitment to the specification of some 
comprehensive theory of problem-solving that is applicable to all tasks. He then goes on to make 
claims about the mechanisms that must be postulated in any comprehensive theory, mechanisms 
such as those that determine how available knowledge is to be used in any particular task. However, 
it requires considerable effort to become acquainted with the technical terms he uses and it is 
impossible beforehand to assess whether it is all going to be worthwhile. Contributions from 
McDermott and Klahr do not fall into this error. McDermott provides a very clear account of what 
production systems are and why they seem especially suited to the kind of problem-solving human 
beings have to solve in the natural environment. Klahr focuses on the construction of production 
systems to account for the behaviour of children in problem-solving. These contributions can be 
recommended to those who, having no basic knowledge of production systems, wish to know what 
they are. 

Kalechofsky’s chapter will probably interest the non-specialist. Kalechofsky draws a parallel 
between Piaget’s theory of cognitive development and the change in our conceptual analysis of the 
world as seen in the history of science. (Piaget is there in the background in a number of 
contributions.) The parallel is striking and anms but if it is Pu a parallel what is the sense in 
drawing it? 

This book is not for browsing and with the exceptions noted here the contributions are not for the 
general reader. Since most of the authors write on topics they have examined before my advice is to 
take a look at the book if you consider that an author's previous work is either of interest or 
importance. 

RICHARD WILTON 


Mechanisms of Learning and Motivation. Edited by A. Dickinson & R. A. Boakes. Hillsdale, N.J.: 
Lawrence Erlbaum. 1979. Pp. xiii + 468. £18.25. 


Subtitled ‘A memorial volume to Jerzy Konorski’, the papers making up this book are derived from 
the Konorski memorial conference held at Sussex in 1977. While some useful editing and cross- 
referencing has taken place, the aim has remained unchanged: to select aspects of Konorski’s work 
drawn from The Integrative Activity of the Brain (1967) and to relate those ideas to recent 
developments in learning and motivation, thereby paying tribute to his contribution to these areas. 
Inevitably then, this is less of a comprehensive text than other strategies might have produced — for 
example, punishment receives attention only in the paper by Gray et al. Equally, the common theme 
does yield a series of offerings which relate to each other. It is perhaps also worth noting that the 
emphasis is on animal psychology, again a deliberate choice by the editors. 

The list of prestigious contributors alone will ensure that this book will become a major source of 
reference for specialists within the field of animal learning and motivation, and this is obviously the 
intended market. Undoubtedly it will also ensure that Konorski's work and ideas filter down to 
undergraduate lecture courses more than they have in the past, a consequence of which the editors 
and contributors would certainly approve. The diversity of contributors also ensures a diversity of 
opinion and interpretation of the general aims of the endeavour. It does, however, seem possible to 
identify certain characteristics common to most or all of the papers. 

First, there is something approaching adulation for Konorski's ideas and work evident in all the 
offerings, and most contributors seek the opportunity to include an insight or anecdote about him. 
This is perhaps reasonable in a memorial volume, but it sat more easily in the ‘live’ atmosphere of 
the conference. In the book the implicit ‘as Konorski might have said’ when an actual reference is 
not available becomes a little irritating. An alternative, and possibly preferable, strategy would have 
been to expand Halliday's lucid and excellent introduction and restrict the eulogies elsewhere a little. 

Second, there is an almost total verbal acceptance of the validity of Konorski's approach (an 
emphasis on the interrelationships of brain and behaviour), although claims such as Gray's that it is 
almost unique to Konorski do seem to stretch adulation a little far. This acceptance by the British 
and American contributors is somewhat surprising, as many would certainly have been labelled 
‘Skinnerian’ a few years ago. However, there is less behavioural evidence for the revision of ideas. 
With the exception of papers by Zielinski and Dabrowska from the Nencki Institute and the 
established physiological psychologists (Moore and Gray), the contributions contain no surprises for 
those used to the traditional behavioural approach to learning and motivation. It may be that this 
volume is a signpost to an integrative approach, but at present it seems still a distant goal. 

A third common feature of the various papers, almost the corollary of the acceptance of 
Konorski's approach, is the implicit or explicit view that learning theory would be in a much better 
state if Konorski alone had influenced its development during the 1940s and 1950s. Skinner is still 
allowed some credit, but only Estes seems to have any lingering sympathy with the endeavours of 
Hull, Spence and others of that era. The case that Konorski's ideas were too long ignored is 
legitimately and forcefully made, but an attempt to reverse the situation by destroying the old gods 
while heralding the new hardly seems to represent progress. 

In other ways the contributors offer variety, particularly in the way that the aim of the collection is 
interpreted. Hearst fulfils the stated aim admirably, with an excellent presentation and evaluation of 
Konorski's and alternative views on classical conditioning. Coupled with equally lucid treatments by 
Dickinson and Mackintosh of instrumental conditioning and by Boakes of interaction between Type 
I and Type II processes, it provides a very useful summary of some of the basic problems. In other 
contributions the need to relate work to Konorski's seems more of an embarrassment. Moore solves 
the problem by a simple dedication of the chapter to Konorski - legitimate perhaps in that he is one 
of the few actually exploring brain/behaviour relations. Estes uses Konorski's interest in ‘higher 
processes’ to seek commonality between conditioning and cognition, but it sadly lacks the clarity and 
incisiveness of his treatment of drive in the late 1950s. 

Throughout the book there is rigorous discussion of many complex theoretical issues. Some are the 
familiar chestnuts of which there is still no resolution. For example, Hearst and Mackintosh come to 
almost opposite conclusions on the ‘one or two process in conditioning’ debate. Problems of more 
recent interest, such as constraints on conditioning, show every sign of becoming equally thorny. 
That various views should be expressed and debated is obviously proper and necessary. What is 
lacking is an attempt to provide an overview and integration of the issues and problems raised. No 
easy task, admittedly, but surely a sad omission in a volume dedicated to Kornorski. 

PHIL WOOKEY 


Studying Children: An Introduction to Research Methods. By Ross Vasta. San Francisco, Calif.: 
W. H. Freeman. Pp. 212. Cloth, £8.20; paper, £3.80. 

The aim of Ross Vasta’s book is to teach students of child psychology ‘how to approach the 
discipline’. Since ‘facts are changing so quickly’ and theories ‘often have an ephemeral quality’, what 
we ought to be teaching our students, he argues, is how researchers set about studying children, since 
this at least can be done with ‘a modicum of timelessness’. He approaches this task by describing and 
categorizing methods of research, ignoring for the most part the issues which they were developed to 
illuminate, and with only a rather casual consideration of the theoretical positions which make them 
seem appropriate to their users. This curiously back-to-front approach leads to some very odd 
comments. The study of infants and children for example is said to be ideally suited to the 
developmental method because ‘a great deal of change occurs in a relatively short time’. It is difficult 
to believe that the aim to encourage the ability of students to assess new theories and facts in a 
critical fashion will be much advanced by this approach. Is an extended description of different 
techniques for conditioning infants for instance really the most appropriate way in which to try to 
induce students to think critically about theories of development in infancy? 

The book starts with an introduction to the scientific method, and the nature of scientific 
explanation. It offers, however, no discussion of the relationship of description to theory, and its 
treatment of the problem of objectivity nowhere suggests that the categories used in the ‘objective’ 
description of behaviour will depend upon an investigator's initial theoretical framework. Within the 
discussion of particular research perspectives, by contrast, there are plenty of sensible comments. The 
problems of experimental design and interpretation are set out simply and readably, and considered 
in some detail, and a chapter on time-series design is clear and useful. However more detailed and 
complete discussion of the statistical issues considered here is already available for students in 
textbooks on statistics, and there are some notable gaps in the discussion of statistical methods. It 
would have been helpful, for instance, to have included more extensive discussion of the use of 
multivariate analysis techniques, with the assumptions required by such techniques discussed in 
relation to the kind of data on which studies of child development are usually based. 

The balance of the book reflects the pattern of current American research in child psychology. The 
limitations, sources of bias, and problems of reliability involved in observational and naturalistic 
studies are heavily, and very properly, stressed. But the conclusion that such studies are suitable only 
for the generation of hypotheses is less happy. Vasta shows no clear recognition that there are 
numerous problems which can be more fruitfully studied in natural situations, and that some 
problems of major developmental importance can only in principle be studied in such settings. For 
example, there is no mention of recent research on the pragmatics of communication and its place in 
the acquisition of language. It is not a reassuring indication of the likely impact of the book on 
students’ ability to assess new theories and facts in a critical fashion that such studies could scarcely 
be carried out within the research guidelines which he recommends. 

There is a sensible brief discussion of the ethics of research, and the book ends with a description of 
professional societies and journals concerned with child psychology. Since those mentioned are 
exclusively North American, however, this chapter will not be of much value to an audience of 
British students. 

JUDY DUNN 


The Origin of Consciousness in the Breakdown of the Bicameral Mind. By Julian Jaynes. London: 
Allen Lane. 1979. £8.95. 


This book is a brilliant display of thin-ice skating. Jaynes displays great dash and elegance in his 
circuit on the nature of evolution of consciousness, hoping (with some justification) that the applause 
which greets his feats will conceal the sound of ice cracking - thrills and near-spills for the interested 
reader. 

He starts with a discussion on the nature of consciousness which, from a Jamesian starting point, 
finishes by almost defining consciousness out of existence. In Jaynes' version consciousness (modern) 
seems to be something like attention. Then he leaps to the nature and evolution of language with a 
stunning set-piece on how language clothes conscious thought. This argument for the necessity of 
language to thought is one of the best things in the book. Jaynes makes clear that language is neither 
master nor servant of thought, but rather a means for making thought accessible to the thinker. This 
is preliminary to the real acrobatics which follow. 

Jaynes claims that at a relatively early stage in man's linguistic and cultural history (around 3000 
BC) modern, internalized, narrative consciousness did not exist, but that in its place was ‘unclothed 
consciousness’ - observations, directions, which were experienced as heard commands. Ancient man 
did not think to himself but was spoken to. These auditory hallucinations were experienced and 
reported as the voices of the gods (or God). This is a joltingly literal reapplication of the notion that 
the gods are man-made, for Jaynes does not talk figuratively. The conclusion that ancient man had a 
bicameral mind, that he experienced a complete separation between the directive and executive 
components of conscious behaviour, is not derived from any logic that I can follow but is suggested, 
brilliantly, by the earlier chapters of the book. It is at this stage, where the ice is thinnest, that Jaynes’ 
pyrotechnics are most spectacular. 

For example, he finds it necessary, having proposed the bicameral mind, to find a neurological 
home for it. So he searches for a brain structure which is both relatively recent and apparently 
without clear function. He settles on Wernicke’s area in the right hemisphere. This leads to a number 
of contortions whereby he has to show that this area is capable of stimulating heard commands 
(extremely meagre and misinterpreted evidence from Penfield), that the RH is ‘Godlike’ in nature (it 
isn’t) and in which any current notions of right temporal function are discounted. The reader may 
like to emulate Jaynes (and the White Queen: ‘why sometimes I believed as many as six impossible 
things before breakfast’) and try to justify at least one more cerebral ‘seat of the Gods’ (frontal lobes 
perhaps?). 

The bicamerality hypothesis, having been revealed at the end of the first of the book’s three 
sections, is elaborated in the following sections. In the first of these Jaynes traces the hypothesized 
course of the evolution of modern consciousness from pre-classical hallucinating bicameral mind into 
classical periods (1000 to 500 BC) when, following the breakdown of ancient hierarchical orders and 
natural catastrophes, cultural interchange generated a Babel of god-voices and the necessity arose for 
personal, internalized, ‘modern’ consciousness. Most of the evidence for this is extremely scanty and 
all of it open to other, simpler explanations. Yet I found the attempt at psycho-archaeology 
engrossing and important. Why shouldn’t a psychologist stake a claim in archaeological 
interpretation? Many people are more interested in how ancient people may have thought than in 
how their institutions functioned, even if the two are interrelated. 

In the final section Jaynes applies the insights of bicamerality to a variety of modern states of 
consciousness. The section on schizophrenia is no more unsound than any other mono-theoretic 
view; his description of schizophrenic states as chemically induced hemisphere separation with an 
unlocking of interhemispheric control is close to that suggested by Dimond and by Paul Green (see 
Campbell & Heap, New Scientist, April, 1979). 

Jaynes turns his hand to the puzzles of hypnosis, as well. This is a beautifully written chapter, more 
sober in style than much of the book, and very good indeed. It seems original too. Jaynes uses the 
phenomenological reality of the hypnotic state as the basis of an explanation — just as he does the 
(imagined) archaic mind. It will not do, therefore, to explain hypnosis away as a form of response to 
social control, as a sort of semi-aware play-acting. He argues, compellingly, that under hypnosis the 
processes that direct thought and action (god-like in the bicameral mind) are dissociated from 
executive function and are given over to the hypnotist. This chapter seems to me the best justification 
for the complicated discussion in earlier chapters where Jaynes attempts to draw a line between 
directive and executive conscious thought. 

Several things stand out after reading the book as a whole. Jaynes is a clear, racy writer with a 
good grasp of his place in psychology and its history, yet perhaps not good enough. Jaynes has no 
store of personal professional involvement in the matters with which he deals. He is not a clinician nor a 
neuropsychologist nor an archaeologist. I miss the authority of first-hand experience in his writing. 

I do not know whether it is this or some sort of West Coast 1960s narcissism that leads him to 
make a single personal experience of an auditory hallucination the cornerstone of his theory. The 
(engaging) admission of a lack of logic in the exposition of the bicamerality theory shows that he is 
happy to be relaxed about rational values; but I am not sure one can have it all ways. West Coast 
intuitions and East Coast, Ivy League rationality (Jaynes is a Princeton psychologist) may be good for 
the soul, but the switches in intellectual style and philosophy through the book are often hard to 
follow and may be there to obscure rather than clarify. 

Perhaps it is pointless to try to identify the speedily pirouetting figure and it is enough to relax and 
enjoy the spectacle - there are few such shows of dazzle and versatility in current psychology. I look 
forward to his next book. 

RUTH CAMPBELL 
Sentence Processing: Psycholinguistic Studies Presented to Merrill Garrett. Edited by W. E. A. Cooper 
& E. C. T. Walker. Hillsdale, N.J.: Lawrence Erlbaum. 1979. Pp. 447. £18.50. 


This book consists of a collection of papers dedicated to Merrill Garrett. The chapters cover a variety 
of empirical issues in sentence perception (six chapters), a couple of topics in sentence production 
(two chapters) and a number of theoretical and philosophical issues (four chapters). The authors 
appear to have been given a free hand as long as their writings reflect Garrett's ‘own scientific stance’ 
in some way. The book would almost certainly have been better if the contributors had been 
constrained by a somewhat tighter editorial policy. As it is some of the chapters seem excessively long 
(viz: Ross & Cooper's exhaustive 76-page analysis of the use of like and other colloquial American 
terms), whilst others, such as J. D. Fodor's abstruse philosophical note on explanations of behaviour, 
bear virtually no relation to the title of the volume. The result is a series of chapters that vary a great 
deal in scope, approach and style. 

Fortunately, while the composition of the book can be criticized, some of the individual chapters 
are well worth reading. Particularly interesting is Forster's chapter on the relationship between 
subprocesses in comprehension. He argues that lexical, syntactic and semantic/pragmatic processes 
are carried out in succession and that the earlier operations are totally independent of subsequent 
processing. The suggestion contrasts with the currently more popular ‘interactive’ view of sentence 
processing. In defending his position Forster is forced to explain away a number of findings which 
apparently contradict it. This he does - sometimes persuasively and always energetically. In some 
cases studies are shown to be inadequate on methodological grounds. In others he strives to preserve 
the ‘autonomy principle’ by arguing that the independent variable operates at a higher level than was 
originally supposed and, when all else fails, he simply proposes that there is an alternative way of 
carrying out the operation under consideration and that it is this newly introduced process that is 
subject to higher influences and not the lexical or syntactic procedures which are responsible for 
normal sentence processing. This argument is more convincing than it sounds, but at the end of the 
chapter one is left wondering whether anything really remains of the autonomy principle. 

Two other chapters in the book are concerned with the relationship between subprocesses in 
comprehension. In the first, Holmes uses data from a series of ambiguity studies to argue, contrary to 
the autonomy principle, that the syntactic structure assigned to a sentence can be modified if it 
generates an implausible interpretation for the sentence. In the second, Chodorow presents evidence 
that syntactic (but not lexical) processes may lag some time behind the input. This argues in favour of 
the autonomy of lexical processing and against certain kinds of top-down parsing. However, these 
conclusions can be queried on the grounds that Chodorow's paradigm (the recall of word strings and 
sentences presented in the form of time-compressed speech) lacks ecological validity. 

This problem is, of course, shared with many experimental techniques used to investigate sentence 
processing. However, there are signs that psycholinguists are at last beginning to take it seriously and 
ask questions about the effects of the demand characteristics of artificial tasks. Indeed, two of the 
chapters in this collection present detailed analyses of particular experimental techniques. These 
chapters (Forster on the speeded classification task and Cutler & Norris on various on-line measures 
of sentence processing) will be useful to all researchers directly concerned with these techniques and it 
is to be hoped that future writers will eventually extend the exercise to other widely used tasks. 

The remaining chapters cover a number of diverse issues in sentence processing. J. D. Fodor 
suggests a general parsing strategy for handling transformed sentences - namely, treating them as if 
they are the terminal string of a well-formed deep structure. It remains to be seen whether this 
‘superstrategy’ is any more viable than the earlier suggestions on perceptual strategies. In two rather 
more empirical chapters, Wales and Toner present data which suggest that the intonation of a 
sentence can indirectly influence the course of syntactic processing and Bever and Townsend examine 
the factors influencing the depth of processing in main and subordinate clauses. In the section on 
sentence production, Lackner and Tuller present evidence that speakers can use proprioceptive 
information to detect speech errors as they are produced and Shattuck-Hufnagel uses speech error 
data to refine and extend earlier models of speech planning. Finally, Valian presents a clearly argued 
statement on the competence-performance distinction. 

It is clear from the style of the papers and the price of the book that it is intended primarily as a 
reference work for active researchers. Most investigators will find one or two chapters that are worth 
referring to and so it should serve this function adequately. 

D. C. MITCHELL 
British Journal of Psychology (1980), 71, 447-448. Printed in Great Britain 


Other publications received 


ALLEN, F. H. Psychotherapy with Children. Lincoln, Nebraska: University of Nebraska Press. 1979. Pp. 311. 
£9.60; paper, £2.70. 

AYLLON, T. & MILAN, M. A. Correctional Rehabilitation and Management: A Psychological Approach. New York: 
Wiley. 1979. Pp. 280. £11.50. 

ARTINANO, F. L. Causas psicosociales del accidente de trabajo. Madrid: Linaza-Reyna. 1978. Pp. 493. 

ASHWORTH, P. D. Social Interaction and Consciousness. Chichester: Wiley. 1979. Pp. 227. £11.00. 

BAGLEY, C. & VERMA, G. K. Racial Prejudice, the Individual and Society. Farnborough, Hants: Saxon House. 
1979. Pp. 235. £8.50. 

BAKAL, D. A. Psychology & Medicine. London: Tavistock. 1979. Pp. 280. £3.95. 

BALLOU, J. W. The Psychology of Pregnancy: Reconciliation and Resolution. Lexington, Mass.: D. C. Heath. 1979. 
Pp. 143. £9.50. 

BARRON, F. The Shaping of Personality: Conflict, Choice and Growth. New York: Harper & Row. 1979. Pp. 359. 
£10.50. 

BROWNE, B. The Aetiology of Masochism. Stockport: Cooperative Motivation Research. 1979. Pp. 38. £3.25. 

BOURNE, R. & NEWBERGER, E. H. Critical Perspectives on Child Abuse. Lexington, Mass.: D. C. Heath. 1979. Pp. 
224. £11.50. 

BURKS, J. & KUPAR, M. Temperament Styles in Adult Interaction. New York: Brunner/Mazel. 1979. Pp. 240. 
$15.00. 

BUSS, A. R. A Dialectical Psychology. New York: Irvington. 1979. Pp. 2h1. £11.50. 

BUSS, A. R. (ed.). Psychology in Social Context. New York: Irvington. 1979. Pp. 407. £14.95. 

BYERTS, T. O., HOWELL, S. C. & PASTALAN, L. A. (eds). Environmental Context of Aging: Life-styles, 
Environmental Quality, and Living Arrangements. New York: Garland STPM Press. 1979. Pp. 237. £22.50. 

CADE, C. M. & COXHEAD, N. The Awakened Mind. London: Wildwood House. 1979. Pp. 275. £3.95. 

CHATEAU, J. (ed.). La Psychologie de L'Enfant en Langue Francaise. Toulouse: Private Publication. 1979. Pp. 284. 

CHRISTMAN, R. J. Sensory Experience, 2nd ed. New York: Harper & Row. 1979. Pp. £10.35. 

COOK, M. Perceiving Others. London: Methuen. 1979. Pp. 180. £6.25; paper, £3.25. 

DACEY, J. C. Adolescents Today. Santa Monica, Calif.: Goodyear Publishing. 1974. Pp. 442. £10.35. 

DALY, A. Why Women Fail. London: Wildwood House. 1979. Pp. 112. £5.95. 

DASEN, P., INHELDER, B., LAVALLEE, M. & RETSCHITZKI, J. Naissance de L'Intelligence Chez L'Enfant Baoulé de 
Côte D'Ivoire. Berne: Huber. 1978. Pp. 324. DM48. 

DOONA, M. A. Intervention in Psychiatric Nursing, 2nd ed. Philadelphia, Pa.: Davis. 1979. Pp. 281. $13.50. 

EVANS, R. I. Jung on Elementary Psychology: A Discussion between C. G. Jung and Richard I. Evans. London: 
Routledge & Kegan Paul. 1979. Pp. 241. £2.95. 

FINE, R. The Intimate Hour. New Jersey: Avery. 1979. Pp. 318. $13.95. 

FOUCAULT, M. Discipline and Punish: The Birth of the Prison. Harmondsworth, Middx: Peregrine Books. 1979. 
Pp. 333. £2.95. 

GATCHEL, R. J. & PRICE, K. P. (eds). Clinical Applications of Biofeedback: Appraisal and Status. New York: 
Pergamon Press. 1979. Pp. 287. $10.45. 

GLENNON, L. M. Women and Dualism: A Sociology of Knowledge Analysis. New York: Longman. 1979. Pp. 250. 
£5.95. 

GOTTESFELD, H. Abnormal Psychology: A Community Mental Health Perspective. Chicago: Science Research 
Associates. 1979. Pp. 533. £11.85. 

GOULDING, M. M. & GOULDING, R. L. Changing Lives through Redecision Therapy. New York: Brunner/Mazel. 
1979. Pp. 312. $15.00. 

GRAY, B. & ISAACS, B. Care of the Elderly Mentally Infirm. London: Tavistock. 1979. Pp. 213. £8.95; paper, £4.50. 

GREEN, A. The Tragic Effect: The Oedipus Complex in Tragedy. Cambridge: Cambridge University Press. 1979. 
Pp. 264. £10.50. 

GREENWOOD, J. W., III & GREENWOOD, J. W., JR. Managing Executive Stress: A Systems Approach. New York: 
Wiley. 1979. Pp. 255. £11.00. 

HARDY, M. & HEYES, S. Beginning Psychology. London: Weidenfeld and Nicolson. 1979. Pp. 228. £6.50; paper, 
£2.95. 

INEICHEN, B. Mental Illness. London: Longmans. 1979. Pp. 112. £2.95. 

KASVATUS, Z. The Finnish Journal of Education 1979. Helsinki. Pp. 61. 

KELLEY, H. H. Personal Relationships: Their Structures and Processes. Hillsdale, N.J.: Lawrence Erlbaum. 1979. 
Pp. 183. £10.00. 

KHAN, M. M. R. Alienation in Perversions (IPAL no. 108). London: The Hogarth Press. 1979. Pp. 245. £8.95. 

KHAN, V. S. (ed.). Minority Families in Britain: Support and Stress. London: Macmillan. 1979. Pp. 203. £12.50; 
paper, £4.95. 

KRANTZ, D. L. Radical Career Change: Life Beyond Work. New York: The Free Press. 1979. Pp. 157. £7.45. 
LAMB, W. & WATSON, E. Body Code: The Meaning in Movement. London: Routledge & Kegan Paul. 1979. Pp. 
190. £5.95. 

LYONS, J. & BARRELL, J. People: An Introduction to Psychology. New York: Harper & Row. 1979. Pp. 535. £7.10. 

McCORDUCK, P. Machines Who Think. San Francisco, Calif.: Freeman. 1979. Pp. 375. £7.80. 

McKAY, C. & COX, T. (eds). Response to Stress: Occupational Aspects. London: IPC Science & Technology Press. 
1979. Pp. 224. £10.00. 

McKELLAR, P. Mindsplit: The Psychology of Multiple Personality and the Disorganised Self. London: Dent. 1979. 
Pp. 188. £7.95. 

MEDNICK, S. A. & SHOHAM, S. G. (eds). New Paths in Criminology. Lexington, Mass.: D. C. Heath. 1979. Pp. 235. 
£12.50. 

MITTLER, P. People not Patients: Problems and Policies in Mental Handicap. London: Methuen. 1979. Pp. 238. 
£4.50. 

MUSSEN, P. H., CONGER, J. J. & KAGAN, J. Child Development and Personality, 5th ed. New York: Harper & Row. 
1979. Pp. 579. £9.70; paper, £6.95. 

MUSSEN, P. H., CONGER, J. J., KAGAN, J. & GEIWITZ, J. Psychological Development: A Life-Span Approach. New 
York: Harper & Row. 1979. Pp. 502. £8.00. 

PFOHL, S. Predicting Dangerousness: The Social Construction of Psychiatric Reality. Lexington, Mass.: 
D. C. Heath. 1979. Pp. 250. £12.50. 

POMERLEAU, O. & BRADY, J. P. (eds). Behavioural Medicine: Theory & Practice. Baltimore: Williams & Wilkins. 
1979. Pp. 308. $22.00. 

PRESSMAN, R. Private Practice: A Handbook for the Independent Mental Health Practitioner. New York: Gardner 
Press. 1979. Pp. 230. £11.50. 

PRIESTLEY, D. Tied Together with String: A Two-Year Study of Care for the Schizophrenic. Surbiton, Surrey: 
National Schizophrenic Fellowship. 1979. Pp. 83. £2.00. 

PRIVAT, E. (ed.). L'Evolution Psychiatrique, vol. 44, pt 2. 1979. Pp. 111. 

RIMM, D. C. & MASTERS, J. C. Behaviour Therapy, 2nd ed. London: Academic Press. 1979. Pp. 538. £10.70. 

ROBINSON, D. (ed.). Alcohol Problems. London: Macmillan. 1979. Pp. 254. £4.95. 

ROBINSON, N. M., ROBINSON, H. B., DARLING, M. A. & HOLM, G. A World of Children: Daycare and Preschool 
Institutions. Monterey, Calif.: Brooks/Cole. 1979. Pp. 250. $8.95. 

ROSSI, P. H., FREEMAN, H. E. & WRIGHT, S. R. Evaluation: A Systematic Approach. Beverly Hills, Calif.: Sage. 
1979. Pp. 236. £8.00. 

SAGAN, C. Broca's Brain. London: Hodder & Stoughton. 1979. Pp. 347. £6.95. 

SHARMAN, K. J. Children’s Television Behaviour: Its Antecedents and Relationship to School Performance. 
Occasional Paper No. 14. Hawthorne, 3122 Victoria: The Australian Council for Educational Research. 1979. Pp. 
84. 

SHELEFF, L. S. The Bystander: Behaviour, Law, Ethics. Lexington, Mass.: D. C. Heath. 1979. Pp. 223. £11.50. 

SHOHAM, S. G. Salvation through the Gutters: Deviance and Transcendence. Washington: Hemisphere. 1979. Pp. 
275. $11.95. 

SONSTEGARD, M., SHUCK, A. & BEATTIE, N. R. Living in Harmony with our Children. Luton, Beds.: Millford 
Reprographics. 1979. Pp. 45. £1.00. 

SPIELBERGER, C. Understanding Stress & Anxiety. London: Harper & Row. 1979. Pp. 128. £4.91; paper, £1.91. 

STRICKLAND, L. H. Soviet and Western Perspectives in Social Psychology. Ottawa, Canada: Pergamon. 1979. Pp. 
220. $30.00. 

SUTTON, C. Psychology for Social Workers and Counsellors: An Introduction. London: Routledge & Kegan Paul. 
1979. Pp. 230. £7.95; paper, £4.95. 

TAYLOR, C. The Explanation of Behaviour. London: Routledge & Kegan Paul. 1980. Pp. 278. £3.75. 

TUFTE, V. & MYERHOFF, B. Changing Images of the Family. New Haven: Yale University Press. Pp. 403. £12.30. 

WARD, C. The Child in the City. Harmondsworth, Middx: Penguin Books. 1979. Pp. 221. £2.95. 

WERNER, E. E. Cross-cultural Child Development: A View from the Planet Earth. Monterey, Calif.: Brooks/Cole. 
1979. Pp. 355. $10.95. 

WOLFGANG, A. (ed.). Nonverbal Behaviour. New York: Academic Press. 1979. Pp. 225. £7.80. 

WOOLLAMS, S. & BROWN, M. TA: The Total Handbook of Transactional Analysis. Englewood Cliffs, N.J.: 
Spectrum. 1979. £9.70; paper, £4.50. 


Oxford Books for Students 


Memory, Thought, and Behavior 
Robert W. Weisberg 


This book is concerned with the acquisition and use of knowledge. 
It focuses on: memory, problem solving, language use and language 
development, the development of thought, and the question of the 
medium of thought. £11 


Educational Psychology in the 
Classroom 


Henry Clay Lindgren 

The aim of this textbook is the clear presentation of the basic principles 
of educational psychology and their application in the classroom. 

Sixth edition paper covers £9.25 


A Practical Guide to Behavioral 
Research 


Robert Sommer and Barbara B. Sommer 


This book provides a multidisciplinary, comprehensive study of behavioural 
research. Numerous practical exercises are included at the end of each 
chapter. £7.50 paper covers £4.50 


The Self in Social Psychology 
Daniel M. Wegner and Robin R. Vallacher 


The recent appearance of several intriguing social psychological theories 
signals a strong new interest in the concept of the self. These theories, 
from self-perception to impression management to objective self- 
awareness span the entire range of social psychology and offer insights 
into topics such as cognition, motivation, social interaction, emotion, 
and attribution. Paper covers £4.25 


Memory and Cognition: 
An Introduction 


John G. Seamon 


This is an introductory textbook for undergraduates studying memory. 
Paper covers £7.95 


A Practitioner's Guide to 
Rational-Emotive Therapy 


Susan R. Walen and others 
Paper covers £5.50 


Oxford University Press 


Brit. Jnl. of Psychology, 71, 3 (i) 

















Language Production 


Volume 1: Speech and Talk 


Edited by B. Butterworth 
May/June 1980, x + 478pp., £28.00 (UK only) / $64.50, 0.12.147501.8 


This and the forthcoming volume will bring together all the important research on 
language production from the fields of adult and developmental psychology, 
diachronic and synchronic linguistics, phonetics, sociology, neurology and artificial 
intelligence. Volume 1 focuses on adult speech. A survey of research methods 
introduces the subsequent reviews which cover the social setting of talk, the 
psychological processes involved in planning and organizing utterances and the neuro- 
muscular processes that give rise to articulation. 


Strategies of Representation 
in Young Children 


Analysis of Spatial Skills and Drawing Processes 


Norman H. Freeman 
April 1980, xiv + 392pp., £20.60 (UK only) / $47.50, 0.12.264750.5 













This is the first book for thirty years to examine the nature of drawing strategies used 
by children in relation to their perception of spatial relationships and the development 
of general spatial skills. 


Social Interaction and Cognitive 
Development in Children 


Anne-Nelly Perret-Clermont 
March 1980, viii + 208pp., £14.80 (UK only) / $34.00, 0.12.551950.8 


The thesis of this book is that it is cognitive coordinations between individuals which 
are the foundation of individual cognitive coordination. To illustrate the thesis, 
research is used which shows how, at certain levels of cognitive development, collective 
performances are superior to individual ones. The author shows convincingly that 
structuring in social interactions plays a part in subsequent individual cognitive 
structuring and how the social conflict of viewpoints plays an important role in the 
elaboration of new cognitive coordinations. 


Aspects of Consciousness 


Volume 1: Psychological Issues 
edited by Geoffrey Underwood and Robin Stevens 
1979, xvi + 252pp., £12.40 (UK only) / $29.00, 0.12.708801.6


This volume of Aspects of Consciousness brings together a variety of discussions on
mental processes. They range across a broad spectrum of topics which are of current
interest in psychology. There are chapters on traditional subjects such as memory and
thinking, and discussions of the relationship between sensory deprivation and consciousness,
of hypnosis (which can be considered an altered state of consciousness) and of time perception.
Theoretically interesting aspects of consciousness are also covered in chapters on the
development of consciousness and absent-minded behaviour.





Academic Press
A Subsidiary of Harcourt Brace Jovanovich, Publishers
London New York Toronto Sydney San Francisco
24-28 Oval Road, London NW1 7DX, England
111 Fifth Avenue, New York, NY 10003, USA
Oxford Journals
BRAIN 


A Journal of Neurology 


Contains original papers in clinical neurology and related disciplines, and in the 
basic neurological sciences where they are relevant to clinical problems. Many of the 
papers appearing in past issues have become classics in the field. 





Recent and forthcoming articles include: 

The Impact on Neurology of 40 years’ Advances in Pharmacology. Marthe Vogt. 
Quantitative Analysis of Stance in Late Cortical Cerebellar Atrophy of the Anterior 
Lobe and Other Forms of Cerebellar Ataxia. K. H. Mauritz, J. Dichgans and
A. Hufschmidt.


Mapping of Local Cerebral Functional Activity by Measurement of Local Cerebral 
Glucose Utilization with [14C] Deoxyglucose. Louis Sokoloff. 


Constant Error of Visual Egocentric Orientation in Patients with Acute Vestibular 
Disorder. Gunnar Hornsten. 


Memory Disorder in Korsakoff’s Psychosis: a Neuropathological and 
Neuropsychological Investigation of Two Cases. W. G. P. Mair, E. K. Warrington 
and L. Weiskrantz. 


Plasticity in Speech Organization Following Commissurotomy. Michael S. Gazzaniga, 
Bruce T. Volpe, Charlotte S. Smylie, Donald H. Wilson and Joseph E. LeDoux. 


Word-form Dyslexia. Elizabeth K. Warrington and Tim Shallice. 


Neurochemical Alterations in Huntington’s Chorea: a Study of Post-mortem Brain 
Tissue. E. G. S. Spokes. 


The Basic Uniformity in Structure of the Neocortex. A. J. Rockel, R. W. Hiorns 
and T. P. S. Powell.


Abnormal Anticipatory Postural Reflexes in Parkinson’s Disease. M. M. Traub, 
J. C. Rothwell and C. D. Marsden. 


Published quarterly (March, etc) 


1980 prices: £22.00 (UK £19.00, US $46.00) p.a. 
Single issues £6.00 ($13.00) 


Oxford University Press 





Oxford University Press 





Seeing: Illusion, Brain, and Mind 
John P. Frisby 


‘I recommend this book for students of psychology and
physiology, as well as for anyone interested in recent views on
how we see. It is factual, accurate, clearly written, and extremely
well produced.’ Richard Gregory in The Times Educational
Supplement. ‘It offers superb illustrations and innumerable
illusions for a modest price.’ New Scientist. Illustrated £6.95


An Introduction to Behavioural 
Geography 
John R. Gold 


This book introduces the reader to a rapidly growing area of
inquiry. It examines the ways in which psychologists and
geographers have previously viewed human behaviour, considers
how people come to terms with the spatial environment, surveys
current knowledge about the nature and characteristics of
spatial cognition, and investigates the links between cognition
and behaviour. £12.50 paper covers £5.95


The Evolution of Human 
Consciousness 


J. H. Crook 


This book explores the evolution of the human mind. The origin 
of human cognition is considered and the sources of human
behaviour and the social environment they have created are
examined, the nature and origin of the distinctly human person
is discussed, and aspects of therapy that can help the personality 
in its natural growth towards autonomy are considered. £15 


Frames of Mind 


Constraints on the Common Sense Conception of the Mental
Adam Morton 

The author argues that there is a stream of improvised devices 
for explaining actions and describing and imagining states of 
mind, held together only by largely unstated constraints on what 
can or cannot be improvised. The main aim of his book is to 
begin stating these constraints on how we can conceive of 
ourselves. Illustrated £8.50 




Oxford University Press 
Clinical Neuropsychology


Edited by Kenneth M. Heilman and Edward Valenstein 


This textbook deals with the behavioural and intellectual 
disorders that clearly have a neurological origin. It gives a 
comprehensive clinical description of the major neurobehavioural
disorders and discusses their pathogenesis. It offers a clinical 
approach to the study of brain—behaviour relationships, and will 
be of special interest to neurologists, clinical psychologists, and 
psychiatrists. £14 Oxford Medical Publications 


Interviewing and Patient Care 
Allen J. Enelow and Scott N. Swisher 


A reviewer of the first edition wrote: ‘It is an outstanding book 
on the doctor-patient relationship . . . there can be few doctors 
who would not learn a great deal from this book.’ 

Second edition paper covers £4.95 


Psychophysiology 


Human Behavior and Physiological Response 
John L. Andreassi 

Research activity in psychophysiology is aimed primarily at 
gaining, through non-invasive and impermanent means, a 
greater understanding of the function of the human organism, 
but the applications of the research are many. Illustrated 
paper covers £4.75 


The Foundations of Primitive Thought 


C. R. Hallpike


The object of this book is to elucidate as rigorously as possible 
the characteristics that seem most distinct and prevalent in the 
thought processes of primitive peoples. Using the principles of 
developmental psychology as worked out by Piaget and others,
the author tackles many of the issues that have plagued 
anthropologists since the beginning of the century. £17.50 
2 NEW JOURNALS FROM
NORTH-HOLLAND PUBLISHING 
COMPANY 


BEHAVIOR RESEARCH OF SEVERE 
DEVELOPMENTAL DISABILITIES 


Editor: P. M. SMEETS, University of Leiden, The Netherlands 


BRSDD publishes original papers devoted to the behavior analysis and rehabilitation
of severe developmental disabilities, such as profound and severe mental
retardation, autism and childhood psychosis, aphasia, deafness, blindness, epilepsy,
cerebral palsy and other gross physical defects. Since the functional analysis
of behavior is the unifying conception of this Journal and since the disabled
or disabling behaviors may be related to a diversity of conditions, published
reports will include aspects which will be of critical importance to disciplines
such as psychology, psychiatry, education, behavioral medicine, remedial
teaching, occupational and physical therapy, nursing and speech pathology.
The journal contains research studies, discussion papers and technical reports.


1980: Volume 1 in 4 issues. Subscription price: US $69.75/Dfl. 136.00. 
Private subscribers are entitled to a subscription at the reduced rate of 
US $30.75/Dfl. 60.00. 


FRENCH-LANGUAGE PSYCHOLOGY 


Journal of Abstracts and Reviews 


Editor: PAUL FRAISSE, University of Paris, France 
Associate Editor: MADELEINE LEVEILLE, C.N.R.S., Paris, France


French-Language Psychology provides up-to-date information for English-speaking
psychologists on the literature and current trends of psychology in French-speaking
countries (principally Belgium, Canada, France and Switzerland).
All books and articles (ranging from experimental psychology to psychoanalysis)
published in French since January 1, 1979 are either mentioned,
abstracted or reviewed. The most salient publications are presented in full-page
summaries, including tables and figures where relevant. In addition,
general articles concerning the progress and current state of psychology in
French-speaking countries are published. Each issue also contains an index,
and a cumulative subject index is published annually in the last issue.


1980: Volume 1 in 4 issues. Subscription price: US $57.00/Dfl. 111.00. 
Private subscribers are entitled to a subscription at the reduced rate of 
US $33.75/Dfl. 66.00.


Personal subscriptions must be prepaid, orders must be placed directly with
the Publisher, and copies should not be made available to institutions.


All prices include postage costs. In case of currency fluctuation, the Dutch 
guilder price is definitive.


Orders and requests for specimen copies may be sent to the Publisher.


NORTH-HOLLAND PUBLISHING COMPANY
P.O. Box 211, 1000 AE Amsterdam, The Netherlands
52 Vanderbilt Avenue, New York, NY 10017, U.S.A.
GENERAL PSYCHOPATHOLOGY


An Introduction 


CHRISTIAN SCHARFETTER 
Translated by HELEN MARSHALL 


This introduction to psychopathology arose from years of contact with in-patients of
a mental hospital and teaching psychopathology to undergraduate and graduate
students. Professor Scharfetter seeks to emphasise those aspects of traditional
psychiatry that have proved useful for understanding the ‘behaviour’ of the mentally
disturbed and as a basis for treatment. The text defines the basic phenomena of
mental disorders, and discusses the various contributions to the understanding of
these disorders that have been derived from phenomenology, experimental and
clinical psychology, neurophysiology and psychodynamic theory. The chapter on
consciousness, which is seen as the medium of every psychopathological experience,
is of central importance, while the concept of five basic dimensions of
consciousness offers a new systematic approach to the schizophrenic ego-disorder.


This text will be invaluable to students taking courses in psychiatry and psychopathology
at an introductory level in medical schools and psychiatric institutes.

Hard covers £20.00 net 

Paperback £6.95 net 


CAMBRIDGE UNIVERSITY PRESS 


ARCHIVIO DI PSICOLOGIA NEUROLOGIA E PSICHIATRIA
Quarterly journal published by the Università Cattolica del Sacro Cuore
General editor: Leonardo Ancona


PSYCHOLOGY: editor Giuseppe Girotti; editorial board Luigi Anolli, Anna Maria Pati, Eugenia Scabini
NEUROLOGY: editor Giorgio Macchi; editorial board Paolo Bergonzi, Guido Gainotti, Pietro Tonali
PSYCHIATRY: editor Leonardo Ancona; editorial board Filippo Ferro, Giovanni Guerra, Corrado Pontaiti


Editorial secretary: Carlo Saraceni
Year XL, 1979, No. 4
CONTENTS
ARTICLES
M. MOLINARI, M. BENTIVOGLIO, D. MINCIACCHI, G. MACCHI. Axonal transport methods
in the experimental study of the connections of the nervous system
A. ALBANESE, M. MOLINARI, M. BENTIVOGLIO. Prospects in the study of neuronal projections:
the combined use of retrograde tracers with histochemical methods for the identification of
neurotransmitters
LANDE, E. TEMPESTA. The role of the raphe nuclei in the regulation of sleep and in the control of
pain
G. VILLA, A. L. ABBAMONDI. Transmissible subacute spongiform encephalopathies: a critical review of the
clinical and neuropathological aspects and of the aetiopathogenetic problems
Intelligence as an information-processing concept


Earl Hunt 







Attempts to relate information-processing capacities to intelligence test scores have had modest success
when normal subjects are used but have been very successful in showing differences between extreme
groups in information-processing capacity. This is illustrated by reviewing studies relating memory to
verbal comprehension. The reason for these disparate results may be that in normal subjects
individual competence may depend largely on differences in choice of a problem-solving strategy.
Using sentence verification as an example, it is shown that information-processing and psychometric
measures are in much closer correspondence when account is taken of one's problem-solving strategy.
This, however, still leaves us with the problem of explaining positive manifold, i.e. the psychometric
evidence for a concept of general intelligence. The relation between g and the idea of attentional
resources is explored, by applying the dual task methodology to memory and reasoning tests. The
results support the notion that g can be related to attentional resources.





My son's high school biology text begins with a chapter entitled ‘The meaning of life’. 
After he and his fellow teenagers have mastered this, they move on to Ch. 2, ‘The 
diversity of life’. Is there a message here for those who would explain intelligence? How 
can we speak about who thinks, or who thinks well, until we have a clear picture of what 
thinking means to us? Viewed another way, if we think that we have a good theory of 
thought, then we should be able to use that theory to describe individual differences. This 
point has been made before (Underwood, 1975). Rather than repeat the argument, I shall 
try to develop it further. What progress has been made by using information-processing 
theories to understand individual differences in cognition? More interestingly, where is our 
progress stymied, why, and what can we do about it? 

A naive, but common, way of studying individual differences in cognition is to establish 
a statistical relationship between performance on psychometrically defined intelligence tests 
and performance on more theoretically defined laboratory tasks. Investigators who do this 
are usually not interested in the intelligence test itself. Their goal is to locate a relationship 
between theoretically motivated measurements of behaviour and competence in practically 
important situations. The straightforward way to proceed is to find the correlation between 
a theoretical measurement and individual performance in the extra-laboratory situation.
Examples of such studies are Love’s (1977) finding that measures of the ability to retain 
arbitrary associations predicted the number of runs computer programmers required to 
debug a program, and Gopher & Kahneman’s (1971) finding that measures of the ability to 
switch attention from one channel to another predicted the accident records of Israeli bus 
drivers. In both of these examples the task to be studied was defined precisely, the criteria 
of performance were clearly stated, and the conditions of work were reasonably similar 
across individuals. It is much harder to establish the cognitive correlates of performance in 
more broadly defined situations, such as ‘success as a lawyer’, because the criteria are not 
clear and because situational factors may be as or more important than individual 
characteristics in determining success.

An alternative way to study the cognitive basis of performance is to make use of the fact 
that psychometric aptitude tests, which are easily given in standardized situations, have 
been shown to correlate with very broad measures of success in a variety of settings. For 
instance, several studies have shown that verbal aptitude test scores are among the best 
predictors of grades in colleges and universities (Willerman, 1979). It has also been found 
that scores on tests of logical reasoning, numerical aptitude, and spatial reasoning 
differentiate people who enter various occupations, such as accounting, law, or engineering,
even though the tests are taken prior to the persons entering professional training and are 
not part of the selection criteria that determine admission to training (Thorndike & Hagen, 
1959). Given their record of statistical association with ‘success’, aptitude test scores can 
serve as surrogates for the unobtainable measures of competence outside the laboratory. 
The argument is that a link between cognitive theory and extra-laboratory competence can
be established by linking both to psychometric test performance. 

Within this framework there have been attempts to link performance on tests designed to 
evaluate verbal aptitude with tasks that are supposed to tap some pure aspect of memory
(Hunt, 1978a). This is an attractive endeavour because of the central role of verbal
processes in our culture and because of the prominence of memory in our theories of
cognition. While the purpose of this paper is to raise questions about intelligence and 
information-processing in general, most of the examples will be based on the study of 
memory and verbal intelligence, simply because we know more about this point on 
the interface between psychometrics and experimental psychology.


The current status of the effort 


Calling the approach of correlating laboratory performance with test scores ‘naive’ is perhaps
too harsh. The experimental paradigms used typically yield parameters that estimate some
theoretically basic information-processing function, e.g. speed or accuracy of access to 
information in short- or long-term memory. It is certainly of interest to determine whether 
or not those people who are facile with linguistic reasoning differ from less facile persons 
along such dimensions of information-processing. Indeed, one of Underwood’s (1975) 
points was that the failure to find that there are differences between more and less 
competent individuals on any of our information-processing parameters should be cause for 
serious rethinking of our theories. 

How much progress has been made in establishing links between theories of memory and 
individual verbal aptitude measures? The answer to this question is ‘Some, but surprisingly 
little’. A common problem keeps resurfacing; very small differences in information- 
processing are found between ‘high verbal’ and ‘low verbal’ subjects within the normal 
range of intelligence, but substantial differences are found if we move to the study of 
extreme groups, such as mental retardates. To see this, let us examine the findings in three 
areas: access to well-learned material, access to recently presented material, and learning. 

Asymptotic memory access refers to the speed with which we can retrieve highly 
overlearned associations. A useful technique for testing the speed of an asymptotically 
learned linguistic association is the stimulus identification paradigm developed by Posner & 
Mitchell (1967), and since used by many others. Figure 1 illustrates the procedure. Two
letters are presented, and the task is to indicate whether or not they have the same name or 
different names. Concentrating our attention on ‘same’ trials, letters may be either 
physically identical (PI), as in the pair A-A, or name identical (NI), as in the pair A-a. The 
reaction time (RT) for identification of NI pairs is greater than the reaction time for 
identification of PI pairs. The difference between RTs for NI and PI pairs, which will be 
called the NI-PI measure, can be regarded as a measure of the efficiency of retrieval of a 
highly overlearned linguistic association.* Note that using the NI-PI measure does not 
commit one to the assumption that physical identification inevitably precedes name 


* It is important that some method of controlling for motor reaction time be introduced into experiments of this
sort. Most of the variance in reaction times in stimulus identification studies is associated with simple choice
reaction times, including the time required to move the fingers. Negative results are quite likely in studies that fail
to control for this effect (e.g. Hogaboam & Pellegrino, 1978).
[Figure 1 panel labels: fixation point, then a letter pair. The subject responds ‘same’ if the letters have the same name, or ‘different’ if the letters have different names.]
Figure 1. The stimulus identification paradigm. The first pair exemplifies the physical identity 
condition (PI), the second pair the name identity (NI) condition, and the third pair the different 
condition. 


identification, but simply to the assumption that name identification is more dependent on 
linguistic associations than is physical identification (Posner, 1978). 
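
As a rough illustration of how the NI-PI measure is obtained for a single subject, the following Python sketch subtracts the mean PI reaction time from the mean NI reaction time on ‘same’ trials. The function name `ni_pi_measure` and the reaction times are hypothetical and are not taken from any of the studies cited.

```python
from statistics import mean


def ni_pi_measure(pi_rts_ms, ni_rts_ms):
    """Return mean NI reaction time minus mean PI reaction time (ms)."""
    return mean(ni_rts_ms) - mean(pi_rts_ms)


# Hypothetical correct 'same'-trial reaction times (ms) for one subject.
pi_trials = [480.0, 510.0, 495.0, 515.0]  # physically identical pairs, e.g. A-A
ni_trials = [560.0, 590.0, 575.0, 595.0]  # name identical pairs, e.g. A-a

ni_pi = ni_pi_measure(pi_trials, ni_trials)
print(f"NI-PI = {ni_pi:.1f} ms")
```

Because both means include the same motor component, taking the difference is one simple way of removing it, which is the concern raised in the footnote on motor reaction time.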

Since reading requires that we make arbitrary connections between visual symbols and 
linguistic concepts, one can reasonably hypothesize that the process of finding letter names 
should be related to the ability to use the written language, which is what a verbal aptitude 
test tests. This hypothesis fares moderately well when we examine studies using subjects 
within the normal range of intelligence. As part of a study of the information-processing 
correlates of intelligence in college students, Lansman et al. (in preparation) determined the 
correlations between the RTs for the NI and PI conditions, the NI-PI measure, and a 
number of tests selected to identify the factors of Cattell's (1971) theory of fluid and 
crystallized intelligence. Table 1 shows the factor loadings for the three reaction-time 
measures. The NI-PI measure has a moderate (0.29) but reliable loading on Cattell’s
‘crystallized’ (Gc) factor, which is identified by tests requiring the use of highly 
overlearned, usually verbal information (Horn, 1979). By contrast, the PI measure alone 
emerges as a measure of Cattell’s ‘clerical and perceptual speed’ factor, which is identified 
by tests that require visual comparisons of letters without consideration of their meaning. 
These results are consistent with a number of other studies which have shown a correlation 
of about −0.30* between the NI-PI measure and verbal aptitude scores. Warren (1978)
found a correlation of −0.34 between NI-PI and WISC verbal scores in grade school
children. There have also been several studies of extreme groups in which it has been found 
that ‘high’ verbal aptitude scores are associated with small NI-PI measures, and ‘low’ 
aptitude scores with longer measures, even though both extreme groups are within the 
normal to above average range (Hunt et al., 1975; Goldberg et al., 1977; Keating & 
Bobbitt, 1978). In all cases, though, the absolute difference is relatively small. For instance, 
Hunt et al. report that in contrasting high and low verbal college students the mean
difference in the NI-PI measure is about 30 ms, in a task in which the mean NI RT is 
between 550 and 600 ms. 

* In general, correlations between reaction-time measures and test scores should be negative, as long RTs reflect
poor performance.
We can make a much more dramatic case for an association between verbal intelligence
and access to long-term memory for verbal codes if we examine the results obtained from 
studies of groups that span the whole range of mental competence. Figure 2 summarizes 
several such studies, covering populations ranging from exceptionally bright college 
students to educable mental retardates. Values of the NI-PI measure range from 60 to over 
350 ms. There is also considerable non-linearity in the relation between the NI-PI measure 
and estimates of ‘general intelligence’. There is roughly a 30 point IQ spread between 
bright university students and average young adults, and a similar IQ spread between
young adults and educable mental retardates. The results shown in Fig. 2 show that the 


Table 1. Correlations between measures in stimulus identification task and factors defined
by Cattell-Horn theory of intelligence

Measure                Crystallized    Fluid           Clerical-perceptual    Visualization
                       intelligence    intelligence    speed
Physical identity RT   0.02            0.19*           0.33*                  0.02
Name identity RT       0.15            0.21*           0.36**                 0.02
NI-PI                  0.29**          0.15            0.25**                 0.01

* P < 0.05; ** P < 0.01.
Note. Abstracted from Lansman et al. (in preparation).
Figure 2. Mean difference between name identity and physical identity RTs for groups varying in
intellectual ability.


equal difference in ‘IQ points’, which is basically a statistical concept, is not paralleled by
an equal difference in our estimates of the efficiency of memory, an information-processing
concept. There seems to be little prior reason for preferring one or the other of
these scales as a measure of mental competence, so it would be hard to argue that the 
non-linear relation indicates that either scale is wrong. 

Since language is a serial method for presenting information, we must have a short-term 
memory capacity in order to comprehend linguistic messages. To what extent are 
individual differences in short-term memory efficiency associated with more globally 
defined differences in verbal aptitude? There are two aspects to this question; how rapidly 
can we access information already in short-term memory, and how much information can
we hold there? 

Short-term memory access speed is usually measured by memory scanning experiments 
(S. Sternberg, 1966), as illustrated in Fig. 3. The observer is shown from one to six letters 


[Figure 3 panel labels: memory set 1, 3, 5, 7; probe 6?; answer ‘No’. Graph axes: RT against memory set size.]


Figure 3. The memory scanning paradigm. In the example at the top of the figure, the subject is first 
presented with the memory set ‘1, 3, 5, 7’. The probe item ‘6’ is then presented and the subject is to 
respond as to whether ‘6’ was a member of the memory set. The graph below illustrates the typical 
finding that RT is a linear function of the size of the memory set. 


or digits, called the memory set, and then is shown a probe stimulus. The task is to 
indicate whether or not the probe was a member of the memory set. RT to make this 
decision is found to be a linear function of the number of items in the memory set, and the 
slope of this function is considered a measure of speed of access to information in 
short-term memory. While there are some reports of correlations between memory scanning
rate and verbal intelligence in normal subjects, the relations are neither large nor 
consistent. Chiang & Atkinson (1976) even reported that the direction of the relationship 
was different in men and women! Our present knowledge supports S. Sternberg's (1975) 
earlier conclusion that there are individual differences in memory scan rates, but that their 
relation to other characteristics of the person is not clear. Once again, though, the picture 
changes when we examine results from extreme groups. Figure 4 shows results from a 
number of such studies. There is more than a 10 to 1 difference between the fastest 
memory scanning reported (by Hunt & Love, 1972, for an expert mnemonist) and the 
slowest reported (by Harris & Fleer, 1974, for mental retardates with suspected brain 
damage). 
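
The scanning-rate measure described above can be sketched as an ordinary least-squares fit of mean RT on memory set size, with the slope read as milliseconds per item. The data below are invented for illustration (they lie exactly on a line), and the helper name `fit_line` is hypothetical; this is a minimal sketch, not the analysis used in any of the studies cited.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical mean RTs (ms) for memory sets of one to six items.
set_sizes = [1, 2, 3, 4, 5, 6]
mean_rts = [438.0, 476.0, 514.0, 552.0, 590.0, 628.0]

slope, intercept = fit_line(set_sizes, mean_rts)
print(f"scan rate = {slope:.1f} ms per item, intercept = {intercept:.1f} ms")
```

On this reading a steeper slope means slower access to each item held in short-term memory, so comparing groups amounts to comparing fitted slopes.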

A similar pattern appears if we change from studies that examine the speed of access of
short-term memory to studies that examine its size, using conventional memory span 
procedures. There are reliable individual differences in memory span that are not associated 
with differential use of mnemonic strategies (Lyon, 1977), but this again appears to be an 
example of a statistically reliable effect that is not practically significant. Normal adult 
memory span runs from five to nine items, depending on the material to be memorized 
(Miller, 1956). Matarazzo (1972) has observed that this is not a wide enough range to be of 
clinical significance.* On the other hand, Matarazzo also advises that memory spans below 
this range may be indicants of brain damage. Ellis (1978) has observed that mental 


* This raises the interesting question ‘What is clinically significant?’ Language is a product of the interaction
between social and biological evolution, and may very well have developed in such a way that ‘proper speaking’
means that the speaker produces language in such a way as not to overtax the information-processing capacities of
all but a very few members of the population. Put another way, human language must adjust to the lowest
information-processing capacity that would be considered ‘normal’, not to the average. If mnemonists constituted
95 per cent of the population, we might have developed a very different communication system.
Figure 4. Functions relating RT to memory set size in the memory scanning paradigm for groups
varying in intellectual ability.


retardates show a deficit on practically any task that taps primary memory capacity, and 
argues strongly that this is not due to a failure of the retardates to use powerful mnemonic 
strategies. Huttenlocher & Burke (1976) have made the same argument with respect to the 
fairly large changes in memory span that occur as children mature. As in the case of 
long-term memory measures, the efficiency of short-term memory is at best a moderate 
predictor of intelligence test scores within the normal adult range, but if we move to the full 
spectrum of mental competence, marked differences in short-term memory efficiency are 
observed. 

Over the years there have been a number of studies that attempt to relate ‘ability to 
learn’ to intelligence test scores. Indeed, some authors have even maintained that
intelligence should be defined as the ability to learn. To the extent that there is truth in this 
proposition, performance in learning experiments should relate to tested intelligence. One 
of the most comprehensive attempts to show this relationship, almost as a by-product of an 
effort to understand the components of learning itself, is an experiment by Underwood et al.
(1978), in which some 200 university students participated in 33 (!) different learning 
experiments, and also made available their scores on the Scholastic Aptitude Test. 
Underwood et al. were primarily interested in the factorial composition of performance on 
the learning tasks, and simply observed that the different subtests of SAT appeared to
represent a cluster of abilities different from those required for learning under various 
conditions. I have reanalysed the Underwood et al. results, including in the analysis the 
aptitude test measures. The analysis recovered the original learning factors and, as 
Underwood et al. suggested, also identified a ‘test factor’ that was independent of the 
learning factors. Table 2 shows the loading of the SAT verbal aptitude test on all six 
factors. Clearly test performance in normal subjects is related to learning performance, but 
the relation is not a close one. On the other hand, though, learning is notoriously deficient 
in the mentally retarded. Clearly no one would want to argue that learning ability is not a 
component of intelligence. 


Table 2. Loading of verbal comprehension test on various memory factors

Factor           1          2          3         4           5         6
                 Paired     Simult.    Serial    Verbal      Free      SAT
                 assoc.     learn.     list      discrim.    recall
SAT-V loading    0.23       0.15       −0.28     0.02        0.20      0.51

Note. Communality of SAT-V = 0.46.
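
The communality in the note to Table 2 can be checked approximately from the loadings. The sketch below assumes that the six factors are orthogonal, so that communality is the sum of squared loadings; that assumption is mine and is not stated in the text. With the rounded loadings from the table the sum comes to about 0.45, close to the reported 0.46, and the small gap is plausibly due to rounding.

```python
# Loadings of SAT-V on the six factors, as given in Table 2.
sat_v_loadings = {
    "paired associates": 0.23,
    "simultaneous learning": 0.15,
    "serial list": -0.28,
    "verbal discrimination": 0.02,
    "free recall": 0.20,
    "SAT (test factor)": 0.51,
}

# Assuming orthogonal factors, communality = sum of squared loadings.
communality = sum(loading ** 2 for loading in sat_v_loadings.values())

# Share of that communality carried by the test factor alone.
test_factor_share = sat_v_loadings["SAT (test factor)"] ** 2 / communality

print(f"communality = {communality:.4f}")
print(f"share due to test factor = {test_factor_share:.2f}")
```

On these figures the test factor carries more than half of the explained variance in SAT-V, which is consistent with the remark that the relation between test performance and learning performance is not a close one.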


These results are typical of many other results relating information-processing to 
general measures of (verbal) cognitive competence. Given reasonable attention to statistical 
power considerations, reliable associations are easy to find. Practically significant 
associations, within the normal range of intellectual competence, are seldom found. Keele 
(1979) has summarized the situation nicely by referring to the ‘0.3 barrier’; no single
information-processing task seems able to account for more than 10 per cent of the variance
in a general intelligence test. Of course, one might hope that a set of, say, 10 such tasks 
would provide us with a complete account of intelligence. Unfortunately, this does not 
work either. Most measures of memory functioning are positively correlated with each 
other, so the multiple correlations between verbal aptitude tests and batteries of 
information-processing measures are seldom higher than 0.6 (see, for instance, Lunneborg, 
1977). On the other hand, as soon as we move to the study of differences between groups 
whose mental competence varies widely, we find that practically every information- 
processing measure will singly differentiate between groups. What we do not find is any 
appreciable number of ‘in between’ studies, in which the correlations are in the 0.5 to 0.7 
range. 
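
The arithmetic behind the ‘0.3 barrier’ can be made concrete with a short sketch. It assumes, purely for illustration, a battery in which every task correlates 0.3 with the test score and every pair of tasks correlates 0.2 with each other; neither the equal-correlation simplification nor the value 0.2 comes from the studies cited. Under that simplification the squared multiple correlation is k r² / (1 + (k - 1) ρ).

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of variance accounted for by a single correlation r."""
    return r ** 2


def multiple_r(k: int, validity: float, intercorrelation: float) -> float:
    """Multiple correlation for k equally valid, equally intercorrelated tasks.

    Uses R^2 = k * r^2 / (1 + (k - 1) * rho). This holds only under the
    simplifying assumption that every task correlates r with the criterion
    and rho with every other task (an assumption made for illustration).
    """
    r_squared = k * validity ** 2 / (1 + (k - 1) * intercorrelation)
    return math.sqrt(r_squared)


single = variance_explained(0.3)        # one task at the 0.3 barrier
independent = multiple_r(10, 0.3, 0.0)  # ten tasks, uncorrelated with each other
correlated = multiple_r(10, 0.3, 0.2)   # ten tasks, assumed intercorrelation 0.2
```

With these assumed values a single task accounts for 9 per cent of the variance. Ten mutually uncorrelated tasks would give a multiple correlation of about 0.95, but an intercorrelation of only 0.2 pulls it down to about 0.57, which is in line with the ceiling near 0.6 described above.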

I do not believe that this problem is a statistical one, produced solely by a tendency to 
study populations who differ either very little or very much in their cognitive competence. 
Rather, I believe that we are seeing evidence of a qualitative difference. Changes in basic 
information-processing parameters probably do account for a great deal of the differences 
in individual cognitive power when we compare, say, mental retardates to high school 
students. When we examine the very real differences in cognitive power between dull and 
bright university students, or even dull and bright *normal people', we may find that these 
differences are produced by other factors. To consider what these other factors are, and 
how they fit into cognitive theory, a return to a more theoretical perspective is in order. 
Cognition and information-processing 


Following the lead of Newell & Simon (1972), I believe that it is appropriate to think of 
human reasoning as being the product of a program’s being executed on a peculiar 
information-processing device, our brains. The ‘computer analogy’ is frequently 
misunderstood to mean that our brain must follow the style of processing of physical 
computers: binary operation, passive memory systems, and serial computation. This is an 
error. The analogy only maintains that it is useful to think about thought by applying the 
same concepts to human reasoning that we would apply to any physical information- 
processing system. 

Every problem-solving machine must possess some mechanistic capacities for storing, 
retrieving, and transforming information. This is the structural aspect of thought. Most of 
the information-processing paradigms of experimental psychology have been designed to 
deal with structural considerations. In order to solve a problem the mechanistic capacities 
must be applied in a particular and possibly highly flexible order. This is the program or 
process aspect of thought. Finally, virtually every activity that we would call intelligent 
presumes some coordination between the present situation and the problem-solver’s store 
of previously acquired information. This is the knowledge aspect of thought. You cannot 
say how or how successfully a particular information-processing system, be it man or 
computer, will attack a particular problem unless you understand its structure, process, 
and knowledge. 

To drive this point home, let us consider an analogy to basketball playing rather than 
computing. If you tried to predict a basketball player’s scoring potential from isolated 
physical characteristics, you would have only limited success. Extreme weakness or lack of 
stature would be associated with very poor performance, but once the person moved into 
the ‘above normal’ field, correlations with physiological measures would break down. The 
reason is that there are two quite different ways of scoring points in basketball. Some 
players score by muscling their way underneath the basket, then jumping up and slamming 
the ball down into the goal. For players who use this strategy, height and weight are good 
predictors of success, while hand-eye coordination and depth perception are not. The other 
strategy for scoring is to step backwards, away from your opponent, and toss a high, 
arcing shot up into the goal, over the heads and hands of the opposition. Players who use 
this strategy need not be particularly large or strong, but must be quick and have excellent 
depth perception. 

With both the computer and athletic analogies in mind, let us look again at intelligence 
as defined by psychometric theory. Intelligence tests designed for use in clinical, 
educational, and industrial prediction are typically (intentionally and properly) designed to 
be work samples for the endeavours to be predicted. They owe their success to the fact that 
they test so many behaviours that they are almost bound to produce a good sample of a 
person's general cognitive capacities (Wechsler, 1975). Given the pragmatic behaviour- 
sampling approach taken in the development of such instruments as the Wechsler and 
Stanford-Binet tests, it is unreasonable to expect that any one information-processing 
procedure would provide 'the answer' to our questions about the nature of intelligence. 

An alternative way to look for a relation between experimental and psychometric 
theories of cognition is to examine the information-processing correlates of psychometric 
tests that have been developed as markers of an explicit theory of intelligence. Examples of 
such tests are those developed to identify the separate but correlated abilities to solve 
problems either by applying previously acquired knowledge or by inventing a solution 
method on the spot, i.e. Cattell’s (1971) crystallized and fluid intelligence factors. The 
Lansman et al. study previously cited (in preparation) illustrates this approach. While these 
studies should aid in advancing our understanding of the relationship between 
psychometric and information-processing theories, the results to date do not indicate that 
they will produce a major breach in the 0.3 barrier. They may push it back to 0.4, but the 
search for a ‘true’ single information-processing function underlying intelligence is likely to 
be as successful as the search for the Holy Grail. This does not mean that 
information-processing approaches have nothing to say about individual differences in 
cognition; it means that something more sophisticated than a search for correlations is 
required. The reason why is interesting. 

Information-processing paradigms such as the stimulus identification and memory- 
scanning paradigms considered above were intended, in so far as possible, to tap a single 
‘basic’ information-processing function. The more robust psychometric tests of intelligence, 
that predict performance in a variety of situations, seem to require the orchestration of 
several different functions. To the extent that this is true, some consideration must be given 
to the ability of an individual to select the functions to be used in attacking a particular 
problem, and to coordinate the execution of those that are selected. This point is illustrated 
by several recent analyses of the information-processing requirements of different 
psychometric tasks. Each of the studies brings us to a similar conclusion, but by a different 
route. 

Snow (in press) constructed an abstract ‘space’ of mental tests, by applying 
multidimensional scaling techniques to the matrix of correlations between individual tests. 
An abstract of his results is shown in Fig. 5. Instead of finding that tests of general 
Figure 5. An abstract of Snow’s multidimensional scaling solution defining a space of mental tests. 


reasoning, including tests of verbal reasoning and comprehension, lay along a dimension of 
this space, Snow found that reasoning and comprehension tests were located in the centre of 
the space. Tests of more specific abilities, such as the ability to complete incomplete figures, 
memory span, or perceptual speed measures, were in the periphery. The peripheral tests, 
though, are those that most resemble the procedures used by experimental psychologists to 
test specific information-processing functions. They present people with very restricted 
problem-solving situations, in which there is only one reasonable way to attack the task. 




Performance in such a situation will be more determined by mechanistic 
information-processing functions than by choice of a problem-solving strategy simply 
because of the limited range of strategies possible. By contrast, performance on tests in the centre of 
the space may be much more dependent on a person’s having available a store of strategies 
to deal with the varied problems presented by different items within each test. To illustrate, 
Snow found that the Raven Matrix test (Raven, 1965) lay in the central cluster. Logical 
analysis of this test has shown that it is amenable to attack by at least two psychologically 
distinct strategies, one based on perceptual reasoning and one based on propositional 
reasoning (Hunt, 1974). Statistical analyses of very large samples of persons taking the 
Raven test have also shown that there are clusters of performance that are, presumably, 
associated with different strategies. The assumption that the test is some sort of yardstick 
for a univariate, normally distributed ability cannot explain the pattern of clusters obtained 
(Hunt, 19788). 

The conclusion that the tests in Snow’s central cluster are characterized by their having a 
number of different solutions, depending on the program the person chooses to use, is 
reinforced by studies of the individual items in tests that are considered ‘good indicators of 
intelligence’. Carroll (1976) performed a ‘Gedanken’ experiment, somewhat similar to the 
analysis of the Raven Matrices, in which he analysed the information-processing 
requirements of various test items in the Educational Testing Service's reference battery 
(Harman et al., 1976). The more complex subtests appeared to Carroll to require more 
different information-processing steps. 

Still more direct evidence has been obtained by R. Sternberg's (1977, 1978) careful 
analysis of the time spent in each information-processing step during the solution of 
individual intelligence test items. Consider, for example, the frequently used verbal analogy 
test. A typical analogy item is 


DOG is to CAT as WOLF is to (HYENA, LION, SKUNK, FOX). 


Sternberg has shown that the solution of such problems can be broken down into several 
steps: encoding the information associated with each term, comparing the first two terms 
(DOG, CAT) to each other, inducing the relationship from this comparison, and applying the 
relationship to map from the third term (WOLF) into one of the possible response terms 
(HYENA, LION, SKUNK, FOX). Each of these steps calls upon different mechanistic 
information-processing actions. Each step will introduce its own variance into performance 
on the problem as a whole. Sternberg has also shown that the separate steps can be 
combined in different orders, and that the importance of an isolated step to total 
problem-solving performance cannot be evaluated without knowing what the combination 
rule is. If this is true of individual test items, how can we expect to establish correlations 
between very specific aspects of information-processing and total test performance unless 
we can identify strategies and the people using them? 


Strategies as mediators of structure: An illustration 


The observation that strategies must be considered in evaluating individual differences in 
cognitive performance is hardly original. Newell & Simon, surely the leading proponents of 
the view that thinking can be modelled by computer simulations, have warned that ‘A few, 
and only a few, gross characteristics of the human information-processing system are 
invariant over tasks and problem solvers’ (1972, p. 788). This is undoubtedly correct. 
Summarizing the relationship between cognitive performance on two different tasks by a 
linear equation may give us a picture of population performance that fails to capture the 
essence of individual problem-solving. But what is the alternative to the correlation 
coefficient? Presenting simulation programs for each person and each task is clearly an 
inadequate summarization. Having provided an excellent argument for rejecting 
correlational studies of thinking, the computer simulation approach as yet has not 
developed an alternative method of summarizing observations. How are we to summarize if 
each person is unique? 

One approach that we can take is to identify groups of people who use similar strategies, 
and apply correlational analysis within each group. Problem-solving strategies can be 
grouped into large classes, based upon the problem representation that each strategy uses. 
Psychologists, computer scientists, and educators have long argued that the way in which a 
problem-solver initially represents the problem is one of, if not the, major determinants of 
performance (Bloom & Broder, 1950; Polya, 1954, 1957; Simon & Hayes, 1976). We can 
divide representations themselves into two broad classes: linguistic representations or 
spatial-imaginal representations. The sorts of skills that a problem-solver uses to solve a 
particular problem will depend very much upon which of these two classes of 
representation is chosen. Furthermore, the argument does not apply only to the very 
complex problems studied in mathematics or education; we have found that it applies to 
ostensibly very simple cognitive tasks. When allowance is made for the type of problem 
representation chosen, and the concomitant choice of strategy, we find that some puzzling 
observations about the relationship between information-processing and psychometric 
performance become quite regular. 
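
The logic of correlating within strategy groups, rather than across the whole sample, can be sketched as follows. The records are invented solely to illustrate the computation and are not data from any study cited here; the function names and the record layout are likewise hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlations_by_strategy(records):
    """Group (strategy, aptitude, time) records and correlate within each group."""
    groups = {}
    for strategy, aptitude, time in records:
        aptitudes, times = groups.setdefault(strategy, ([], []))
        aptitudes.append(aptitude)
        times.append(time)
    return {s: pearson(a, t) for s, (a, t) in groups.items()}


# Invented illustrative records: (strategy, verbal aptitude score, task time).
records = [
    ("linguistic", 40, 1300), ("linguistic", 50, 1200),
    ("linguistic", 60, 1100), ("linguistic", 70, 1000),
    ("spatial", 40, 650), ("spatial", 50, 700),
    ("spatial", 60, 700), ("spatial", 70, 650),
]
pooled = pearson([r[1] for r in records], [r[2] for r in records])
within = correlations_by_strategy(records)
```

In this contrived case the pooled correlation is weak (about -0.22), while the within-group values are -1.0 for the linguistic group and 0 for the spatial group. A modest overall correlation can therefore conceal a strong relation in one group and none in the other.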

The task that we have chosen to study is the sentence verification paradigm (Clark & 
Chase, 1972), a miniature linguistic situation in which verbal statements must be 
coordinated with non-verbal stimuli. In a sentence verification paradigm the participant 
first sees a sentence describing a simple picture, and then sees the picture. The task is to 
determine whether or not the sentence accurately describes the picture. Some examples are 
shown in Fig. 6. A logical analysis of each sentence is also shown in the figure. This 











Trial type              Sentence                  Picture     Sentence representation    Picture representation
True affirmative (TA)   STAR IS ABOVE PLUS        * over +    [AFF(STAR, TOP)]           (STAR, TOP)
                        PLUS IS BELOW STAR
False affirmative (FA)  PLUS IS ABOVE STAR        * over +    [AFF(PLUS, TOP)]           (STAR, TOP)
                        STAR IS BELOW PLUS
True negative (TN)      PLUS IS NOT ABOVE STAR    * over +    {NEG[AFF(PLUS, TOP)]}      (STAR, TOP)
                        STAR IS NOT BELOW PLUS
False negative (FN)     STAR IS NOT ABOVE PLUS    * over +    {NEG[AFF(STAR, TOP)]}      (STAR, TOP)
                        PLUS IS NOT BELOW STAR











Figure 6. Sample sentence verification items. 


demonstrates that the sentences vary in the extent to which they contain embedded 
propositions. A number of experiments have shown that the time required to verify a 
sentence as a description of a picture depends upon the extent of the propositional 
embedding. (For a review of this literature, see Carpenter & Just, 1975). Furthermore, speed 
of sentence verification has been shown to correlate moderately well with measures of 
general verbal comprehension (Baddeley, 1968; Lansman, 1978). On its face, and from a 
theoretical analysis of the task as an exercise in psycholinguistics, the task appears to be a 
reliable, rapid way to measure one’s competence in dealing with linguistic materials. This is 
particularly interesting because the test itself is virtually knowledge free, while many 
conventional tests of language comprehension have been criticized for their dependence upon 
specific semantic knowledge. 
One of the major strengths of the sentence verification task as a measure of language 
performance is its close tie to theories of psycholinguistic information-processing. As noted, 
a psycholinguistic approach assumes that sentence verification requires the resolution of 
various embeddings. The basic ideas of the psycholinguistic approach are that: 

(a) The picture is represented by the simplest possible propositional representation. Thus 
the picture (* over +) would be represented as STAR ABOVE PLUS. 

(b) The process of verification involves successive transformations of the sentence 
representation until it either matches the picture representation or no further 
transformations are possible. Thus to resolve STAR NOT BELOW PLUS the marked form BELOW 
must be converted to NOT ABOVE and the negations must be resolved. 
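
The representations listed in Fig. 6 can be reproduced with a minimal sketch. It covers only the logical outcome of verification (true or false) and makes no attempt to model the processing times that the psycholinguistic models predict. The function names and the tuple encoding are assumptions made for illustration.

```python
import re


def represent_sentence(sentence: str):
    """Turn e.g. 'STAR IS NOT BELOW PLUS' into ('NEG', ('AFF', 'PLUS', 'TOP'))."""
    m = re.fullmatch(r"(STAR|PLUS) IS (NOT )?(ABOVE|BELOW) (STAR|PLUS)", sentence)
    if m is None:
        raise ValueError(f"unsupported sentence: {sentence!r}")
    subject, negated, relation, obj = m.groups()
    # The marked form BELOW is recoded so that the proposition names the top figure.
    top = subject if relation == "ABOVE" else obj
    core = ("AFF", top, "TOP")
    return ("NEG", core) if negated else core


def verify(sentence: str, picture=("STAR", "TOP")) -> bool:
    """Return True if the sentence correctly describes the picture."""
    rep = represent_sentence(sentence)
    negated = rep[0] == "NEG"
    core = rep[1] if negated else rep
    matches = (core[1], core[2]) == picture
    # A negation reverses the outcome of the embedded comparison.
    return matches != negated
```

For the picture with the star on top, `verify("STAR IS ABOVE PLUS")` is true (TA), `verify("PLUS IS ABOVE STAR")` is false (FA), `verify("PLUS IS NOT ABOVE STAR")` is true (TN) and `verify("STAR IS NOT ABOVE PLUS")` is false (FN).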

All psycholinguistic information-processing models assume that each transformation 
takes time. They differ only in the way they regard the transformations. Clark & Chase 
(and Trabasso et al., 1971, in a related paper) estimate parameters for resolving marking, 
negation, and the affirmative-negative decision separately, whereas Carpenter & Just (1975) 
regard each of these as the same process, requiring estimation of a single parameter. Both 
models can be shown to account for better than 90 per cent of the variance in the times 
required to verify different types of sentences. In general, negatively worded sentences 
require more time to verify, sentences with marked forms take longer to verify, and 
negative decisions are slower than affirmative decisions. There are also interactions between 
these effects, which are predicted by the psycholinguistic model. By any account, the fit of 
the data to the models is impressive. 

Most studies of sentence verification have used relatively few subjects, and hence have 
not studied individual differences. Lansman (1980) obtained data from some 70 subjects. 
Averaged over subjects, the data showed a close fit to the expectations of the 
psycholinguistic models, as is shown in Fig. 7. However, she realized that the individual 
differences data could be used to discriminate between the two main psycholinguistic 
models. The ‘one parameter’ model requires that there be a very high correlation between 
estimates of individual times required to resolve different types of embedding, since each 
resolution is assumed to be accomplished by the same process. The results are shown in 
Table 3. The expected high correlations did not appear, so the single parameter model can 
clearly be rejected. But the multiple parameter model is also in trouble. The reason for this 
has to do with our estimate of falsification. Two estimates are possible, one for 
affirmatively worded and one for negatively worded sentences. The two estimates of the 
same parameter are not correlated. Clearly the models that do so well in handling response 
times averaged over individuals are doing very poorly when applied to individual 
differences data. 
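
The individual parameter estimates that enter Table 3 are simple contrasts between a subject's mean verification times for the four trial types. A minimal sketch is given here; the function name, the dictionary layout and the example times are hypothetical, and only the contrast definitions are taken from Table 3.

```python
def verification_parameters(mean_rt):
    """Compute the Table 3 contrasts from one subject's mean RTs per trial type.

    mean_rt maps 'TA', 'FA', 'TN' and 'FN' to mean verification times.
    """
    ta, fa, tn, fn = (mean_rt[k] for k in ("TA", "FA", "TN", "FN"))
    return {
        "negation": (tn + fn) - (ta + fa),
        "falsification_I": fa - ta,    # estimate from affirmatively worded sentences
        "falsification_II": tn - fn,   # estimate from negatively worded sentences
    }


# Hypothetical mean times for a single subject, for illustration only.
example = verification_parameters({"TA": 1000, "FA": 1100, "TN": 1500, "FN": 1400})
```

Computing these contrasts for every subject and then correlating them across subjects gives the entries of Table 3. If falsification were a single process, the two falsification estimates should correlate highly.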

These paradoxical observations have been resolved by a series of experiments conducted 
by Colin MacLeod, Nancy Mathews and myself (MacLeod et al., 1978; Mathews et al., in 
press). To foreshadow, we have shown that the type of information-processing underlying 
sentence verification depends upon how the subject approaches the task. Our procedure, 
which differs slightly from that used in some other studies, is shown in Fig. 8. The sentence 
is presented, and left on display until the subject indicates that it has been comprehended. 
The time required for this will be called comprehension time. The picture is then presented, 
and the subject decides whether or not the picture was correctly described by the sentence. 
The time required for this decision will be called verification time. It is important to 
remember that verification time is the dependent variable that has been used in other 
sentence verification tasks. 

In our first experiment (MacLeod et al., 1978), we applied Carpenter & Just’s 
one-parameter model to both group and individual data. Averaged over subjects, the 
differences in verification times for the various sentence—picture combinations agreed well 
Figure 7. Comparison of observed group means and values predicted from the Clark & Chase model 
of the sentence verification task (R² = 0.972): ---, observed; ——, predicted. 


Table 3. Correlations between parameters, all subjects 

                                         Falsification I   Falsification II   Below time
                                                                              (Reliability = 0.52)
Negation time, (TN + FN) - (TA + FA)     0.28**            0.32**             0.22*
  Reliability = 0.91
Falsification I, (FA - TA)               -                 0.10               0.44**
  Reliability = 0.74
Falsification II, (TN - FN)              -                 -                  0.17
  Reliability = 0.77

* P < 0.05; ** P < 0.01. 


with the predictions of the one-parameter model. On an individual basis, however, the fit 
ranged from very good to very poor. We identified three groups of subjects, those whose 
data conformed closely to the model, those whose data appeared to bear no resemblance 
whatsoever to any data predicted by a psycholinguistic model, and a group of ‘in between’ 
persons. The first two groups will be referred to as the ‘well-fit’ and ‘poorly-fit’ groups. As 
is the historic fate of compromisers, the third group will not be further discussed. 
Figure 8. The sentence verification paradigm with sequential presentation of sentence and picture. 
Figure 9. Mean verification RTs of the well-fit and poorly-fit 
groups as a function of the number of constituent comparisons 
hypothesized by Carpenter & Just's model; also included are the 
95 per cent confidence intervals, and the best fitting straight line 
for the well-fit group only: , well fit; ——, poorly fit. 





Figure 9 shows the relationship between the predictions of Carpenter & Just's model and 
the data from the well-fit and poorly-fit groups. The discrepancy is striking. But why? We 
hypothesized that the two groups were using qualitatively different strategies. The strategies 
we believed to be involved are depicted in Fig. 10. In the linguistic strategy the subject 
reads the sentence, remembers it in some form tied to the propositional structure of the 
sentence, then observes the picture, derives a sentence (or propositional structure) from this 
observation, and compares the two representations. In the spatial-imaginal strategy the 
subject reads the sentence, forms a mental image of the picture that is expected, then 
observes the picture and compares the internal visual representations of the observed and 
expected display. 

Two independent analyses were conducted to test this hypothesis. The linguistic strategy 
places the burden of translation from one representation to another on the verification 
stage, while the spatial-imaginal strategy places the burden on the comprehension stage. 
Accordingly, users of the linguistic strategy should spend more time in verification and less 
in comprehension, while the reverse should be true of the users of the spatial-imaginal 
strategy. Table 4 shows the relevant data. This prediction was confirmed. The second 
analysis, which was especially relevant to individual differences, examined the relationship 
between verification time and psychometric scores of verbal and spatial aptitude within 
groups of strategy users. There should be an interaction between predictability and strategy 
use. Verbal comprehension scores should be closely related to verification for linguistic 
strategy users, while spatial aptitude scores should be closely related to verification for 
spatial-imaginal strategy users. The appropriate correlations are shown in Table 5, and are 
as predicted. 
Figure 10. Flow chart representations of the strategies believed to be used by the well-fit and 
poorly-fit groups. 


While the MacLeod et al. study produced a consistent pattern of results, a post hoc 
analysis of data is always suspect. The Mathews et al. study extended our reasoning by 
reproducing the data for the two strategies experimentally. The experiment consisted of 
three sessions, on successive days. On the first day the MacLeod et al. sentence verification 
procedure was replicated. This will be called the ‘free’ condition. The criteria developed 
from the MacLeod et al. experiment were used to divide the new sample of subjects into 
groups, and the analysis from the first study was repeated. The same phenomena were 
observed, as is shown in Fig. 11. The second and third days were replications except that 
the subjects were instructed to use one strategy or the other. (As there was no evidence of 
an effect of order of instructions, this variable will be disregarded.) Figure 12 shows the 
results. It is clear that our university student subjects were able to perform in accord 
with either the spatial or the linguistic strategy. Thus the MacLeod et al. results should not be 
interpreted as establishing a type of reasoning, in the sense that such typologies as 
introvert-extravert or field dependent-field independent have been proposed. Rather, our 
results show that above average young adults can shift from one strategy to another 
Table 4. Mean overall comprehension RT and verification RT for well-fit and poorly-fit groups 

Group                 Comprehension   Verification
Well fit (n = 43)     1652            1210
Poorly fit (n = 16)   2579            651


Table 5. Correlations of psychometric scores with mean verification RT 

Group        Nelson-Denny comprehension   WPC verbal   WPC spatial
Well fit     -0.47*                       -0.52*       -0.32
Poorly fit   -0.03                        -0.33        -0.68*

* Significant beyond P < 0.01. 


relatively easily, and that the underlying abilities that they use to solve an ostensibly 
linguistic task depend upon strategy choice. 

Whether or not less talented subjects could display the same flexibility in strategy choice 
is an open question. We need further studies to determine the conditions under which 
particular types of individuals will use particular strategies. The general point remains 
valid. The relationship between task performance and information-processing capabilities 
depends upon the individual's choice of how the task is to be done. Most complex 
problems, including those that are typical of general intelligence test items, permit 
considerable flexibility in making this choice. 


The problem of general intelligence 


By stressing the importance of strategy choice in intellectual performance, we implicitly 
develop an argument for a view of intelligence as a combination of special abilities; i.e. the 
‘ability’ to make good strategy choices. The extreme statement of this view is that there is 
no such thing as general intellectual capacity. Cognitive behaviour is instead seen as a 
compendium of structural capacities and strategies to hold them together. This viewpoint is 
consistent with much of the thinking in both experimental and psychometric psychology. 
The quotation from Newell & Simon (see above) is a good summary of its logic. 
Psychometricians will recognize the specialized viewpoint as being a restatement of 
Guilford's (1967) view that there are a variety of highly specialized abilities, each defined by 
stating the type of stimulus material being processed, the type of operation required on it, 
and the type of answer required. Indeed, Guilford has used this cross-classification scheme 
to generate a table of over 100 hypothetical abilities! 

An opposing view, which dates back to Spearman (1927), and is represented today by 
Horn (1979) and Jensen (1979), is that there are one or two broadly relevant ‘general 
intelligence’ capabilities, which permeate virtually all intellectual endeavours. The principal 
evidence for the general intelligence viewpoint is the observation that superficially disparate 
intellectual tasks are almost always positively correlated. 

The argument between the generalist and the specialist view does, at times, take some of 
the aspects of an argument over whether a glass is half full or half empty. The generalist 
points to the undeniable fact that many cognitive tasks are positively correlated, with rs in 
the 0.3 to 0.4 range. The specialist observes that the r² values are only 0.1 to 0.2! Granted 
Figure 11. Mean verification RTs of the well-fit and poorly-fit groups as a function of the number of 
constituent comparisons hypothesized by Carpenter & Just’s model. Results are for the first day of 
the study, during which subjects received no instructions concerning strategy. ——, well fit (n = 21), 
——-, poorly fit (n = 11). 
that this is true, the phenomenon of widespread positive correlations between different tests 
(technically, the phenomenon of positive manifold) is too robust a fact to be ignored. 
Explaining it within an information-processing concept requires that we locate some 
information-processing concept that applies to an equally wide range of behaviours and 
show that this concept is related to test performance. 

There is such a concept, but it does not fit easily into the information-processing view 
usually presented. This is the concept of attentional resources. Probably the most 
comprehensive recent statement of this concept has been given by Kahneman (1973), 
although a number of other names are also associated with the idea. Posner (1978) has 
cited references to the concept in the late 19th century. 

The basic assumption of ‘attention theory’, for want of a better name, is that every 
human information-processing task requires the allocation of some (rather poorly defined) 
‘attentional resources’ for its execution. If less than enough resources have been supplied to 
a particular mechanistic process, then that process may be able to function but it will do so 
at a reduced level of efficiency. Whether or not this will have a catastrophic effect upon 
thinking depends upon the extent to which the affected process is central in the 
problem-solving strategy being executed. The attentional resource concept is even broader 
than the concept of general intelligence, for attentional resource demands are assumed to 
be made by non-intellectual information-processing tasks, such as signal detection, as well 
as by such things as paragraph comprehension and arithmetic problem-solving. 

Marcy Lansman and I have been exploring the possibility that differential demands for 
attentional resources can be used to explain individual differences in a wide range of tasks, 
all of which involve information processing, but not all of which would conventionally be 
called ‘thinking’. In order to study attention resource demands we have used the ‘dual 
task’ methodology, in which a person is asked to do two information-processing tasks at 
once. We examine inter-task interference as an indication that the two tasks draw on a 
common mental resource. Such paradigms have been subjected to extensive theoretical 
analysis (Kerr, 1973; Norman & Bobrow, 1975; Posner, 1978). Customarily, one of the 
tasks is designated to be the primary task, and the other the secondary task. (For brevity, 
we shall refer to tasks A and B.) An assumption of the strict secondary task interpretation 
is that task B is done with whatever spare capacity remains after task A has been executed. 
This implies that task B should not interfere with task A. We, and others, have found 
that this assumption can seldom be justified, so we offer a slightly different analysis of the 
dual task paradigm that does not depend on the primary task-secondary task distinction. 

Tasks A and B must be chosen so that it is not reasonable to expect them to compete for 
the same information-processing structures (‘structural interference’). For instance, one would 
certainly not use tasks that required incompatible responses, such as moving a lever in task 
A and pressing a button with the same hand in task B. Such gross examples are easy to 
deal with. In practice, though, the situation may be much more subtle, and whether or not 
structural interference has been avoided is often a matter of judgement, a point to which we 
shall return in a moment. First, though, we consider an extremely simple model of the 
distribution of attention. Figure 13 shows the logic of the simple view. Imagine that there 
are two ‘machines’, tasks A and B, that compete for attentional resources in the same 
sense that a washing machine and a light might compete for electrical power in the home. 
As one machine (task) begins to overload the circuit the power (attention) available for the 
other machine will be reduced. 

To extend this model to the study of individual differences, suppose that Fig. 13 was a 
diagram of a washing machine-light circuit, and that the washing machine was inefficient 
and thus exerted a heavy load on the system just before it broke down. In a very simple 
circuit (i.e. one without safety fuses) the first indication that you would have of a 
malfunction of the washing machine would be a dimming of the lights as the appliance 
began to make excessive demands on the circuit. 

Figure 13. The battery model of attentional resources (after Kahneman, 1973). [Diagram labels: ‘Attentional capacity’; ‘An overly simple model of thinking’.] 
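
The circuit analogy can be written out in a few lines. What follows is a minimal sketch under stated assumptions, not a model taken from the studies: the linear demand function, the capacity value, and the names `demand_of_task_a`, `resources_left_for_b`, and `efficiency` are all introduced here purely for illustration.

```python
def demand_of_task_a(difficulty: float, efficiency: float) -> float:
    """Attention drawn by task A.

    Assumption (illustrative only): demand grows linearly with difficulty and
    a less efficient performer needs more attention for the same difficulty.
    """
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return difficulty / efficiency


def resources_left_for_b(capacity: float, difficulty: float, efficiency: float) -> float:
    """Attention remaining for task B once task A has taken its share (never negative)."""
    return max(0.0, capacity - demand_of_task_a(difficulty, efficiency))


# Two hypothetical people with the same total capacity but different efficiency on task A.
capacity = 10.0
for difficulty in (2.0, 4.0, 6.0):
    efficient = resources_left_for_b(capacity, difficulty, efficiency=2.0)
    inefficient = resources_left_for_b(capacity, difficulty, efficiency=1.0)
    print(f"difficulty={difficulty}: efficient={efficient}, inefficient={inefficient}")
```

In this sketch the less efficient performer leaves less attention for task B even when task A is easy, which corresponds to the dimming of the lights in the analogy.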

Lansman and I have made use of a similar logic in our studies. We sought tasks A and B 
that had the following characteristics: 

(a) the tasks are sufficiently different so that structural interference is unlikely; 

(b) the difficulty and, in theory, the demand for attentional resources of task A can be 
varied in a continuous manner; 

(c) the level of performance of task B varies in response to the attentional resources 
supplied to it. 

One series of experiments (Lansman, 1978; Hunt et al., 1979) applied the paradigm to 
study attentional demands in easy and hard memory tasks. Task A was the continuous 
paired-associates task developed by Atkinson & Shiffrin (1968). In this task the subject 
must keep track of the continuously changing state of several variables. This is done by 
pairing numbers with letters, periodically requiring the subject to report the number 
currently paired with a letter, and then changing the letter-number pairing. The exact 
procedure is shown in Fig. 14. The task can be made arbitrarily difficult by varying the 
number of letter-number pairs that must be kept in mind. Task B was a simple probe 
reaction task that was inserted during the memory task. Figure 14 shows the procedure for 
a visual probe; auditory probes were also used. 
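
The event sequence of the continuous paired-associates task can be sketched as a small generator. This is an illustration of the procedure as described, not the software used in the experiments. The digit range 0 to 9, the random choice of which letter to query, and the rule that a new number always differs from the old one are assumptions, and the function name `run_paired_associates` is hypothetical.

```python
import random


def run_paired_associates(letters, n_queries, rng):
    """Yield (event, letter, number) tuples for one session.

    Each query asks for the number currently paired with a letter; the queried
    letter is then immediately paired with a new number.
    """
    current = {}
    for letter in letters:  # sequential presentation of the initial pairs
        current[letter] = rng.randint(0, 9)
        yield ("present", letter, current[letter])
    for _ in range(n_queries):
        letter = rng.choice(letters)
        yield ("query", letter, current[letter])  # correct answer at this moment
        # Assumption: the new number is never the same as the one it replaces.
        new_number = rng.choice([n for n in range(10) if n != current[letter]])
        current[letter] = new_number
        yield ("present", letter, new_number)


# Easy condition in this sketch: two letters to keep track of.
events = list(run_paired_associates(["A", "B"], n_queries=3, rng=random.Random(0)))
for event in events:
    print(event)
```

Memory load is varied simply by passing more letters; a probe task would be inserted after the "present" events that follow a query.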

Our interest centres on probe performance under the easy memory load condition (keeping track 
of two variables) as a predictor of individual performance under hard memory load (seven 
variables). Recalling the washing machine analogy, probe reaction time under the easy 
memory condition is analogous to the light's intensity when the washing machine has a 
small load and should thus predict, across individuals, which persons would have 
the most difficulty in the paired-associates task were the memory load (number of 
associates to be retained) to be increased. The relevant correlations are shown in Table 6. 
There was a reliable, moderately high correlation between probe reaction time in the easy 
memory condition and memory performance in the hard memory condition. 

Event                                               Display      Duration 
Sequential presentation of initial pairs.           A=7          3 s 
                                                    B=3          3 s 
Query. The correct answer is 3.                     B=?          Subject-paced 
Letter just queried is paired with a new number.    B=4          3 s 
(Visual probe: on 3/4 of the trials in the          (***)        (If subject fails 
probe condition, asterisks appear 500, 1000,        B=4          to respond to probe 
or 1500 ms after the presentation of the new                     within 1.5 s, the 
pair. The subject presses any key as quickly                     probe disappears.) 
as possible.) 
Query. The correct answer is 7.                     A=?          Subject-paced 
Letter just queried is paired with a new number.    A=5          3 s 

Figure 14. The dual task paradigm used by Lansman (1978), which involved recalling a series of 
letter-digit pairs and responding to a simple visual stimulus. 


Table 6. Correlations between probe RT and recall scores in various conditions, after 
Lansman (1978) 

                           Easy recall    Hard recall 
Control condition            −0.09          −0.05 
Easy recall condition        −0.27*         −0.40** 
Hard recall condition         0.01           0.07 

* P<0.01; ** P<0.05. 


One can object that while this does show that probe reaction and memory do draw upon 
a common attentional resource, after all, short-term memory is not the same as thinking. 
We have applied the same design to an analysis of two tasks that differed even more 
radically in their surface characteristics (Hunt et al., 1979). In this experiment task A was a 
subset of 18 of the 36 Raven Progressive Matrices problems (Raven, 1965). Raven 
problems require that the subject detect a relationship between the elements of a complex 
visual pattern, and then apply that relationship to complete a missing part of the pattern. 
The problems vary widely in difficulty. The Raven Matrix problems are particularly 
interesting as a sample task A because this test is frequently cited as one of the best 
measures of the general intelligence factor (Jensen, 1979).* 

Task B was a psychomotor task designed so that it would not normally be considered a 
test of intelligence. The subject has to hold a lever between two posts, using the thumb and 
index finger of the left hand. By itself, this is quite easy to do. The task becomes difficult 
when the subject is distracted, in this case by attempting to solve Raven Matrix problems 
that were projected onto a screen immediately in front of the subject. Procedurally, the 
subject first practised the psychomotor task alone, then solved 18 Raven problems alone, 
and then solved 18 Raven problems while doing the psychomotor task concurrently. The 
Raven problems were presented in ascending order of difficulty, as determined by the extensive 
norms available for the test (Forbes, 1964). 

If both the Raven Matrices and the psychomotor task are drawing on the same 
attentional resources, then performance on the psychomotor task should deteriorate as the 
Raven problems become harder, as, indeed, it does. It is difficult to interpret this, however, 
as we do not have a clear model for attention allocations as the subject begins to ‘break 
down’ by making errors on more difficult problems. A more sensitive test is to observe 
psychomotor performance before the subject makes an error on the Raven items. As a 
person approaches the first Raven problem that represents, for that individual, a non-trivial 
problem, the person’s psychomotor performance should deteriorate. Just where this 
happens in the sequence of Raven items, however, will vary from individual to individual. 

* Referring back to Fig. 5, we see that the Raven test is located near tests that Horn and Cattell categorize as 
fluid intelligence (Gf) tests. Hunt (1974) has shown that the test can be attacked using a number of different 
strategies. Interestingly, Spearman (1927) agreed with the conclusion that the test measures g. 

There are two ways that this prediction can be tested. By the same logic that applied in 
the memory experiment, there should be a correlation between individual psychomotor 
performance on the first five problems (on which virtually no errors are made) and the 
point at which a person makes his or her first error. The correlation was −0.30, which was 
statistically significant at the 0.02 level. Note that this cannot be explained by assuming 
differential concentration on one task or the other, because people who are doing well on 
the psychomotor task also do well on the Raven problems. Also, the correlation was 
calculated after partialling out performance on the psychomotor task alone, and hence cannot 
be explained by assuming that people who do well on the psychomotor task also do well 
on the intelligence test. 
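
The partialling step can be illustrated with the standard first-order partial correlation. The text does not say which computation was used, so this is a sketch under that assumption. The variable roles (x for dual-task psychomotor performance, y for the position of the first Raven error, z for psychomotor performance alone) and the numbers in the example are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up scores for four people, for illustration only.
x = [1, 2, 3, 4]    # psychomotor score during the easy Raven items
y = [2, 1, 4, 3]    # problem number of the first Raven error
z = [1, -1, -1, 1]  # psychomotor score when the task is done alone
print(partial_correlation(x, y, z))
```

When z is unrelated to both x and y, as in this made-up example, the partial correlation equals the ordinary correlation; it departs from it only to the extent that performance on the task done alone is related to the other two measures.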

The effect can be shown somewhat more graphically by plotting psychomotor 
performance on the three problems just prior to problem N, as a function of N. Figure 15 
shows this for two groups of subjects, those who make their first error on problem N and 
those who make their first error on some problem beyond problem N. Clearly the subjects 
who are about to make an error show worse performance on the psychomotor task while 
solving problems just prior to their first error. 
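
The grouping behind this comparison can be sketched as follows. The data layout (a dictionary per subject with a `first_error` problem number and a list of per-problem lever `deviations`) and the scores themselves are hypothetical; only the grouping rule and the three-problem window come from the description above.

```python
from statistics import mean


def deviation_before(deviations, n, window=3):
    """Mean deviation score over the `window` problems just before problem n (1-based)."""
    if n <= window:
        raise ValueError("problem n must have at least `window` problems before it")
    return mean(deviations[n - 1 - window : n - 1])


def compare_groups(subjects, n, window=3):
    """Average pre-problem-n deviation for the error group and the correct group.

    Error group: first error on problem n. Correct group: first error later than n.
    Subjects whose first error came before problem n are left out.
    """
    error_group = [deviation_before(s["deviations"], n, window)
                   for s in subjects if s["first_error"] == n]
    correct_group = [deviation_before(s["deviations"], n, window)
                     for s in subjects if s["first_error"] > n]
    return (mean(error_group) if error_group else None,
            mean(correct_group) if correct_group else None)


# Invented scores for three subjects on eight problems.
subjects = [
    {"first_error": 6, "deviations": [0, 0, 2, 3, 4, 5, 6, 6]},
    {"first_error": 8, "deviations": [0, 0, 0, 1, 2, 2, 3, 5]},
    {"first_error": 4, "deviations": [1, 2, 3, 5, 5, 6, 6, 7]},
]
print(compare_groups(subjects, n=6))
```

Calling `compare_groups` for each value of N gives one pair of points per abscissa value, which is the form of plot described for Fig. 15.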

Figure 15. Deviation rate on the lever during the three Raven problems preceding the problem 
plotted on the abscissa (problems 6, 10 and 13 are marked). ‘Error GP’ (dotted line), the 
performance of those subjects who made their first error on that problem; ‘Correct GP’, the 
performance of those subjects who made their first error on a later problem in the 
sequence. [Plot titled ‘Deviations before error’; abscissa, ‘Problem’.] 

The interpretation we would like to make of these results is that deterioration in the 
secondary psychomotor task is due to diversion of attentional resources from it to the 
primary memory or reasoning task. When we observe two people with identical primary 
task performance but one shows deterioration on the secondary psychomotor task and the 
other does not, we interpret this as evidence that the person with the lower secondary task 
performance requires more attentional resources than does the other person to reach the 
same level of performance on the primary task. Hence, we assume that the first person is a 
less efficient performer of the primary task. This conclusion would not be warranted solely 
on the basis of secondary task deterioration. The two individuals might be identical in their 
efficiency of use of attentional resources in performing the primary task, but might vary in 
the total attentional resources available. If this were the case, though, we would expect 
there to be a positive correlation between performance in the two tasks done alone, under 
the assumption that in such a situation performance in each task would (partly) reflect the 
total level of attention resources available. In our own studies using secondary tasks in 
conjunction with primary memory and reasoning tasks the correlation between the 
cognitive and psychomotor tasks, done alone, was negligible and not statistically reliable. 
This strengthens our belief that secondary task deterioration was due to the efficiency of a 
person's primary task execution. 

In isolation the secondary task results support the argument that there is a pervasive 
‘mental energy’ that underlies a wide variety of cognitive tasks. This notion can be traced 
back at least to Spearman (1927), who believed that some sort of mental power concept 
was required to account for the g factor in many intelligence tests. More recently, Jensen 
(1979) has restated this argument, using as evidence the observation that in extreme group 
designs those groups that score high on intelligence tests have faster choice reaction times, 
after allowance has been made for motor movement. Jensen regards this as a measure of 
mental quickness, which would presumably be similar to the ability to handle a secondary 
task signal quickly and without interrupting the primary task. (An element of motor choice 
was involved in the psychomotor task used in the Raven Matrix experiment just cited.) In 
fact, in two separate studies in our laboratory we have found correlations of about 0.3 
between speed of choice reaction time, again allowing for motor movement, and scores on 
the Raven Matrices test. There seems to be enough evidence to conclude that the ability to 
concentrate attention and to make simple decisions rapidly is a component of much more 
complex mental behaviour. However, it is unlikely that the extremely simple model shown 
in Fig. 13 is any more than a rough heuristic for picturing the role of attention allocation 
and concentration during thinking. Let us consider briefly why the model needs 
amplification. 

The greatest simplification in this model, and, indeed, in Kahneman's (1973) book from 
which it is drawn, is that attention is treated as a single resource. If this were true, then 
every ‘attention demanding’ task would compete with every other task. They do not. The 
degree of inter-task interference depends upon the specific requirements of each task. In 
general, tasks involving memory appear to affect both motor-tracking and reaction-time 
tasks, but the latter two do not always interfere with each other (Wickens, 1978). Also, 
the model demands that as the difficulty of the primary task is increased, the performance 
level of the secondary task should always drop. This is not always the case. For example, 
McLeod (1977) had people do tracking and mental arithmetic tasks concurrently. 
Performance on the tracking task did not decrease as the arithmetic task's complexity was 
increased. An even more puzzling pattern has appeared in our own data. We have 
sometimes observed correlations between secondary task performance with an easy primary 
task and individual performance on a hard primary task, even though secondary task 
performance does not decrease with an increase in the difficulty of the primary task (Hunt 
et al., 1979). These results are clearly not consistent with the simple view that mental and 
sensori-motor tasks compete for the same attentional resources. Elsewhere (Hunt, 1980) I 
have suggested that the conflict is more likely to arise from two sources, competing 
demands by each task for specific mental structures and competition for the attention of an 
internal coordinating structure that must determine how to reconcile the first sort of 
competition. Lest this should seem complex, think of a simple analogy. People waiting in 
line at an airline ticket counter may not be in competition for seats on the same flight, but 
they are in competition for the services of the computer that assigns the seats. 

A somewhat different complication in interpreting experiments on attention and 
reasoning has to do with the assignment of priorities that the subject makes, both within 
and across tasks. Kerr (1973) has pointed out that the strict logic of the secondary task 
paradigm requires that one first completely processes the primary task, and then does the 
secondary task with ‘left over’ resources. It is doubtful that people have such fine control 
over attention allocation. We have found rather few situations in which the secondary task 
does not interfere with the primary task. A related problem arises within simple choice 
reaction-time tasks, where the subject is asked to respond as fast as possible, but also to 
minimize errors. If we observe two individuals, one of whom is faster and less accurate than 
the other, we have no way of knowing whether the differences in speed of 
responding are due to differences in speed of decision-making or to differences in willingness 
to accept errors. This criticism has been made of Jensen’s studies of choice reaction-time 
and intelligence,* where error rates were not reported. It is unlikely that this particular 
artifact affected the results, as in our own work we typically observe positive correlations 
among speed, accuracy, and psychometric test scores when these are calculated across individuals. 
Over trials, within an individual, however, speed and accuracy are negatively correlated. To 
my knowledge, there are no adequate investigations of the interplay between these 
phenomena. 

These cautionary remarks are not meant as an argument that studies of the relationship 
between attention allocation and complex reasoning are uninterpretable. The simple model 
depicted in Fig. 13 does quite well as a summary of what we know, now. Because 
experiments outside the individual differences field have shown that attentional phenomena are 
complex, it is almost certain that the simple model will have to be modified in great detail 
as we learn more about individual differences in attention allocation. 


Concluding comments 


People differ widely in how, and how well, they think. One of the biggest sources of 
individual variance in thought is simply knowledge; different people know different things. 
Psychological research on intelligence has tended to ignore this, regarding it as more 
properly part of the realm of education or sociology. This is a limited view. The role of 
knowledge must be included in any comprehensive account of individual cognition. On the 
other hand, there are situations in which wide ranges of cognitive ability are displayed 
when it seems unlikely that knowledge is a determinant of differential performance. The 
experiments reported here are examples. 

Three sources of individual differences in information processing have been proposed: 
basic functions, choice of strategy, and attentional resource allocation. These factors should 
affect cognitive competence in different ways. Structural resources set limits on the 
effectiveness of specific information-processing functions. Such processes appear to be 
important when we contrast the cognitive capacities of quite different individuals, such as 
the contrast between normal and mentally retarded persons. As we learn more about 
subpopulations within such extreme groups, we may very well find that there are specific 
structural changes that apply to each normal-‘unusual group’ contrast. For example, there 
is already evidence that specific types of mental retardation will lead to specific 
information-processing deficits (Money, 1964; Warren, 1978). 

* Jensen's presentation of his results at the NATO York Conference on Intelligence and Learning (1979), and 
again at the Mathematical Psychology Society's 1979 meetings at Brown University, elicited a good deal of 
criticism. Much of it was based on a concern that he had not properly allowed for the speed-accuracy trade-off. 

Attentional allocation and strategy choice differences exert powerful but more transient 
effects on cognition. Not only do we think differently between ourselves, each of us varies 
in our own thought processes from time to time. Structural influences on thought will be 
mediated by strategy choice, so the relationship between complex thinking and measures of 
particular information-processing functions may often depend on how we are thinking at 
the time. While there is undoubtedly some truth in the notion of general cognitive styles, it 
would be a mistake to think that a given individual has a fixed style of thought. More 
studies are needed of the interaction between personal and situational characteristics and 
an individual's choice of problem representation and problem-solving strategy. Our results 
indicate that these variables can operate in what would appear to be, superficially, very 
simple problem-solving situations. More complex situations than sentence verification 
undoubtedly offer an opportunity for a much greater choice of strategy. 

It has been argued that the phenomenon of positive manifold, the tendency for 
intellectual tasks to be positively correlated (in spite of the effect of strategy choice just 
described), can be derived from the concept of attentional resources applied to complex 
problem-solving situations. For the same reason, we expect correlations between 
intellectual performance and perceptual-psychomotor performance under stressful 
conditions. We also expect to find mutual interference between intellectual tasks and 
demanding psychomotor activity. (Indeed, such interference was found in the dual task 
studies described above.) Airline pilots should not compose poetry during landings. 

Psychologists and sociologists have frequently discussed the causal correlates of cognition. 
Studies have been performed relating cognitive performance to variables such as education, 
socio-economic status, genetic constitution, and nutrition. The information-processing view 
of cognition suggests that some thought be given to how these variables are supposed to 
mediate our ability to think. Variables that represent relatively permanent characteristics of 
an individual, such as sex, genetic composition, and chronic injury, can presumably affect 
structure. Attentional resource changes may also be subject to such influences, but they will 
also reflect transient changes in an individual's physical state, responding to such things as 
the acute effects of drugs or illness, fatigue, and diurnal variation. Strategy choices are 
subject to a still wider range of influences. The problem-solving strategies a person could 
use will be determined by attentional and structural resources. The strategies that he or she 
will actually use will, within limits, be determined by education in its broadest sense. How 
has the person learned to solve problems? Who can learn to apply what strategies? I 
believe that more will be learned about the nature of cognition and its antecedents if we 
study the role of such causal agents directly upon measures of information-processing 
structure, attention, and strategy choice than will be learned from studies in which the 
dependent variables are extremely complex ‘intelligence tests’. 
Modality differences in relation to grouping in immediate recall 
D. E. Broadbent, P. J. Cooper, C. R. Frankish and M. H. P. Broadbent 





An earlier study of auditory grouped presentation of digit strings had shown recall to be impaired by 
recalling the last group first. From this, it has been concluded that the item codes formed at Stage 1 
of processing, and before conversion to a group code at Stage 2, were not yet in an output buffer. 
The present paper employs visual presentation for two similar studies, with results opposite to the 
earlier one; in a third experiment, ungrouped auditory stimulation gave results which were a 
compromise between those for vision and those for grouped auditory stimulation. These results 
suggest a use of echoic storage, over long intervals; in the visual case this is not so and an output 
buffer is employed. 





For many techniques of studying memory, differences between visual and auditory 
presentation are found only for the most recently presented items (see, for example, 
Murdock, 1967; Conrad & Hull, 1968; Crowder & Morton, 1969; Murdock & Walker, 
1969). Admittedly, there is evidence that subjects can report over quite long retention 
intervals whether an event was detected by the eye or ear (Kirsner, 1974). This however 
does not require any difference in principle between visual and auditory presentation, 
merely that the sense organ has been encoded along with the information delivered to it; 
no differences in the principles of the memory mechanism need be supposed. The modality 
effect for recently presented information does show such difference in principle, and has 
been used (Broadbent, 1971) to argue that recency effects involve sensory buffer storage 
rather than abstract post-categorical storage. Once one departs from the most recent events 
in memory, however, it is tempting to ignore the sense organ of arrival, and to analyse 
memory mechanisms without considering it. 

At least, Broadbent (1975) appears to have fallen into this trap. In attempting to draw 
conclusions about the processing of events after they have been encoded, he used 
experimental data based solely on acoustic presentation. The purpose of the present paper 
is to show that the results are completely different for visual presentation; and that this is 
so for material quite early in the sequence. The results therefore alter Broadbent's original 
analysis, and also act as a cautionary example, that modality of presentation cannot be 
ignored even for fairly remote events. 

The distinctive feature of the experiments in Broadbent (1975) was that the stimuli were 
grouped rather than arriving in a regular stream. Under these conditions it has been 
pointed out by Frankish (1976) that modality differences can appear, not merely for the 
last group of items presented, but for earlier ones as well. Broadbent had however argued 
as follows. 

The effects of grouping in presentation are well known (Wickelgren, 1964; Ryan, 1969). 
An extremely plausible theory of the benefits of grouping is that each group of items is 
used to generate a fresh unique code, specific to that group rather than to each of the items 
within it. This view is supported by a demonstration of Bower's (1972), using meaningful 
sequences of letters embedded within a long string. Such sequences are beneficial only when 
the imposed group corresponds to the segmentation appropriate to the meaning. Another 
line of support is that repeated presentations of a list at intervals in an experiment do not 
show the usual Hebb effect, of improved recall on the later presentations, if the grouping is 
changed (Winzenz, 1972). Frankish (1976) has added further evidence by demonstrating 
that recall of items within a group is more contingent upon the recall of other items in that 
same group than it is upon items outside the group. From this theory, Broadbent argued 
that each item within a group would need to be encoded temporarily on its arrival; and the 
fresh group code formed when all the items of the group had been received. The stage of 
the individual item codes he termed Stage 1, and that of the later group he called Stage 2. 
He then attempted to find characteristics of Stage 1 and Stage 2. 

One possibility was that Stage 1 represented a sensory store or icon, and that 
information from Stage 2 proceeded onwards to some further stage for output as response. 
In that case, the beginning of the list would have proceeded further through the system by 
the time of arrival of the last items, and reproduction in the original order of groups ought 
to be better than reproduction in reversed order. An alternative idea, however, was that 
Stage 1 was not merely used for input, but was also an output buffer, in the fashion 
familiar from ordinary laboratory computers. On this view, the information which had 
gone to Stage 2 would need to come back to Stage 1 before output. At the end of 
presentation, the last group would still be at Stage 1, while earlier groups would be in 
Stage 2; on this view, therefore, it would be easiest to respond with the last group first and 
only then to bring back the earlier groups to Stage 1 and reproduce them in their turn. 

Thus the prediction was that recalling the last group of items first would be 
advantageous if Stage 1 were an output buffer, but not if it were an icon or other early 
stage of processing. Broadbent (1975) carried out the appropriate experiment with auditory 
input, and found a marked impairment from recalling the last group first. Broadbent 
therefore drew the conclusion that Stage 1 was not an output buffer. 

The whole argument assumes, however, that modality effects are unimportant for early 
items in the list. Frankish's demonstration that this is not so, when grouped items are used, 
makes it essential to repeat the study using visual presentation. As we shall see, the results 
are in fact completely different in that case. 


Experiment 1 

This experiment repeated that of Broadbent (1975) but using visual presentation. That is, 
lists of nine items were presented, grouped by threes, and the task was in one condition to 
recall in the order of arrival; in another condition, the last group was to be recalled first. 
The comparison of interest lies in the relative ease of the two recall orders. When auditory 
presentation was used, reverse recall was harder; would this still be true with visual 
presentation? 


Materials 


Forty sequences each of nine digits were constructed, each grouped into threes. Repeated digits were 
allowed in a list, but were never adjacent. The lists were in fact identical with those used in the 
original experiment. In this case, however, each list was recorded on videotape, by computer, 
generating the list one digit at a time at the required speed, and making a recording from the 
computer CRT with a Sony Video Rover. For fast rates of presentation, each digit in turn appeared 
in the same spatial location for a half second, and was followed by the next digit. After each group 
of three digits, there was a further half second interval before the next digit. For slow presentation, 
each digit appeared for a second, with a further second following every third digit. 

Each list was preceded by two warning signals: first a voice saying ‘ready’; second, an asterisk 
appearing in the location where digits would subsequently appear. 
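
The list constraints and the presentation timing can be sketched in code. This is an illustration only: the original lists were fixed materials taken from the earlier experiment, not generated this way. The digit range 0 to 9 and the function names `make_list` and `onset_times` are assumptions.

```python
import random


def make_list(rng, length=9):
    """Return `length` digits in which repeats may occur but are never adjacent."""
    digits = [rng.randint(0, 9)]
    while len(digits) < length:
        candidate = rng.randint(0, 9)
        if candidate != digits[-1]:
            digits.append(candidate)
    return digits


def onset_times(rate, length=9, group_size=3):
    """Onset time in seconds of each digit.

    Fast: each digit is shown for 0.5 s; slow: for 1 s. After every third
    digit there is one further interval of the same length.
    """
    step = {"fast": 0.5, "slow": 1.0}[rate]
    times, t = [], 0.0
    for i in range(length):
        times.append(t)
        t += step
        if (i + 1) % group_size == 0:
            t += step  # extra gap that marks the group boundary
    return times


digits = make_list(random.Random(1))
print(digits)
print(onset_times("fast"))
print(onset_times("slow"))
```

The extra interval after every third digit is what imposes the grouping by threes; the two rates differ only in the step length.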


Subjects and design 


Forty adult women from the Oxford Subject Panel were tested. All conditions were given to each 
person; each block of 10 lists was under identical conditions. Half the subjects were instructed to 
recall in the order of presentation for the first two blocks; for the second two blocks they were to 
recall the last group of digits first, followed by the first and second groups in order of presentation. 
The other half of the subjects employed reverse recall for the first two blocks, and direct recall for the 
last two blocks. Within each order of recall conditions, half the subjects received fast presentation for 
their first block and slow for the second, and half the subjects did the reverse. 
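
The counterbalancing can be laid out explicitly. The sketch below is one reading of the design, not a record of how subjects were actually assigned: it assumes that subjects were allotted to the four cells in rotation and that a subject kept the same rate order in both pairs of blocks. The function name `block_schedule` is hypothetical.

```python
def block_schedule(subject_index):
    """Return the four (recall, rate) blocks for one subject (0-based index).

    Assumptions (illustrative): cells are filled in rotation, and the same
    fast/slow order is used within both recall conditions.
    """
    first = "direct" if subject_index % 2 == 0 else "reverse"
    second = "reverse" if first == "direct" else "direct"
    rates = ("fast", "slow") if (subject_index // 2) % 2 == 0 else ("slow", "fast")
    return [(first, rates[0]), (first, rates[1]),
            (second, rates[0]), (second, rates[1])]


schedules = [block_schedule(i) for i in range(40)]
for i in range(4):
    print(i, schedules[i])
```

Under these assumptions each of the four orderings is taken by ten of the forty subjects, and every subject meets all four combinations of recall order and rate once.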


Procedure 

Instructions were read aloud before the whole experiment, and those concerning presentation rate 
and recall condition were given in altered form before each block. Subjects were tested in small 
groups, and response was written using sheets with groups of three spaces for each recall. An interval 
of 10 s for response was allowed between the disappearance of the last digit in each list, and the 
auditory warning for the next trial. 


Results 


Items were scored as correct only if the correct digit was reported in its correct serial 
position. With lists of nine digits, recall of items regardless of position is somewhat 
meaningless as guessing gives very high performance by that measure. Figure 1 shows 
performance plotted against order of recall. It is clear that this, and not presentation order, 
is the major determinant of recall; the level of performance for the second group recalled 
was about the same whether it was the first or second presented; and similarly for the third 
group recalled. Position of a group in the recall order does give a significant effect, 
F = 74.81, d.f. = 2, 78, MSe = 66.188, P < 0.001; the later groups being less well recalled. 
The effect of direct or reversed recall is also easily significant, F = 8.85, d.f. = 1, 39, 
MSe = 18.726, P < 0.01; it is due primarily to the first group recalled, and the interaction 
of the two effects is significant, F = 3.77, d.f. = 2, 78, MSe = 41.201, P < 0.05. The effect is 
of course in the opposite direction from that of Broadbent (1975), error being less when the 
last group presented is the first to be reproduced. 
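
The scoring rule can be sketched in code. This is an illustration under assumptions, not the scoring procedure as actually carried out: the function names are hypothetical, a written response is taken to be a list of nine entries in the order written, and a blank would be entered as `None`.

```python
def expected_response(presented, recall):
    """Order in which the digits should be written.

    'direct' keeps presentation order; 'reverse' puts the last group of three
    first, followed by the first and second groups in presentation order.
    """
    groups = [presented[i:i + 3] for i in range(0, len(presented), 3)]
    if recall == "reverse":
        groups = [groups[-1]] + groups[:-1]
    return [digit for group in groups for digit in group]


def errors_by_recall_group(presented, response, recall):
    """Number of wrong positions in each successive group of three as written."""
    target = expected_response(presented, recall)
    wrong = [t != r for t, r in zip(target, response)]
    return [sum(wrong[i:i + 3]) for i in range(0, len(wrong), 3)]


presented = [1, 2, 3, 4, 5, 6, 7, 8, 9]
written = [7, 8, 9, 1, 3, 2, 4, 5, 6]  # two digits transposed in the second group written
print(expected_response(presented, "reverse"))
print(errors_by_recall_group(presented, written, "reverse"))
```

Counting errors by position in the written response, not by position in the presented list, is what allows performance to be plotted against order of recall.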

There is also a main effect of presentation rate, the slow rate being better, F = 13.18, 
d.f. = 1, 39, MSe = 20.709, P < 0.001, and an interaction of speed with position of group 
in the recall sequence, F = 7.39, d.f. = 2, 78, MSe = 15.407, P < 0.01. Inspection of Fig. 1 
shows that the beneficial effect of slow presentation is due primarily to the last group 
recalled, and is less noticeable earlier in the sequence. 
Figure 1. Error proportions for Expt 1, that is, visual presentation and written response: ○—○, fast 
forward; △---△, slow forward; ●—●, fast backward; ▲---▲, slow backward. 

Figure 2. Error proportions plotted from the results of Broadbent (1975), that is, for auditory 
grouped presentation and written response: ○—○, fast forward; △---△, slow forward; ●—●, fast 
backward; ▲---▲, slow backward. 


These results concerning speed of presentation are again different from those found in 
the original experiment, where speed interacted with the recall instructions, and if anything 
slow speeds were worse than fast. Figure 2 shows for comparison the data of Broadbent 
(1975), replotted in order of recall to be comparable with Fig. 1. Had the data of that 
experiment taken the form of the current one, Broadbent would clearly have come to 
opposite conclusions, and have decided that Stage 1 was an output buffer. That is, the last 
group to arrive receives a benefit from an immediate response, since it is still in Stage 1; 
earlier groups had been passed to Stage 2, and any difficulty in recalling them is due 
primarily to the process of retrieval. 

A further detailed analysis was performed on the differences between Expt 1 and the 
data of Broadbent (1975). This is reported later in the paper; it concerns the exact 
explanation for the effects of sensory modality, whereas at present we are merely 
attempting to show that such effects exist. 

For the moment, the key result for the reader to bear in mind is that backward recall 
was better than forward in this visual experiment. 


Experiment 2 

The first experiment shows the importance of modality; but the response was written as it 
had been in the study by Broadbent (1975). Written responses involve vision, and it could 
therefore be argued that the differences between experiments were due to the use of a 
different modality for input and output in one case, but the same modality in the other. On 
this argument, the differences would not characterize vision and audition as such. As a 
check therefore the experiment was partially repeated using visual input, but with 
individual testing and a spoken response. 


Materials 
Exactly the same lists and presentation methods were used as for Expt 1. 


Subjects and design 


Twelve adult women from the Oxford Subject Panel were tested; as the results were to be merely 
confirmatory of Expt 1, order of speeds within each condition of recall instruction was confounded 
with the sequence of recall instructions. Examination of the data of Expt 1 showed that this 
confounding was unlikely to produce harmful results. Thus six subjects were instructed to recall in 
the order of presentation for their first two blocks, and in the reversed order for their second two 
blocks. These subjects always encountered fast presentation before slow for each type of recall order. 
The other six subjects encountered the reverse recall order first, and also met slow presentation 
conditions before fast ones. 

Figure 3. Experiment 2, that is, visual presentation and spoken response: ○—○, fast forward; 
△---△, slow forward; ●—●, fast backward; ▲---▲, slow backward. 


Results 


Figure 3 shows the results, which are closely similar to those of Expt 1. Again, the position 
of a group in the order of recall has a major effect, the later groups recalled being less 
accurate, F = 25.30, d.f. = 2, 22, MSe = 66.816, P < 0.001. Again, as before, backward 
recall is superior to forward recall, F = 5.69, d.f. = 1, 11, MSe = 13.977, P < 0.05. Thus 
the discrepancy between Expt 1 and the results of Broadbent (1975) is clearly due to the 
modality of presentation, and not the similarity of the modality of presentation and 
response. 

There are however some differences from Expt 1. First, neither of the two interactions 
with recall position is significant; for that with forward/backward recall, F = 1.27, d.f. = 2, 
22, MSe = 27.74, P > 0.20. For the interaction with presentation speed, F = 1.09, d.f. = 2, 
22, MSe = 9.51, P > 0.20. (It can be noted from the size of the relative error terms that the 
subjects differ more in the impact of order of report upon the effect of position of a group 
in recall order, than they do in the impact of speed of presentation. Possibly the former 
condition may be met by different strategies while the latter is less likely to be so.) The 
main effect of speed of presentation is similarly insignificant, F = 0.35, d.f. = 1, 11, 
MSe = 33.689, P > 0.20. There is one significant interaction involving speed of 
presentation; that with direct/backward recall, for which F = 7.72, d.f. = 1, 11, 
MSe = 18.385, P < 0.05. The direction of the interaction is that a slow rate of presentation 
is relatively more helpful to reversed recall, and the fast rate to direct recall. 

Figure 4. Experiment 3, that is, auditory presentation ungrouped and with a suffix: ○—○, fast 
forward; △---△, slow forward; ●—●, fast backward; ▲---▲, slow backward. 

These differences from Expt 1 do suggest that response mechanism plays some role, 
though not in the major findings which affect Broadbent's argument about Stage 1 and 
Stage 2. From this experiment again, he would have concluded that Stage 1 was an output 
buffer. Once again, backward recall was better than forward recall in a visual experiment. 


Experiment 3 


It was stated baldly in the introduction that modality effects were so important because of 
the grouping of presentation. Direct evidence of this is however desirable. Consequently, 
the last experiment used acoustic presentation but without any pause following each group 
of items. Thus the use of a sensory store would be harder, because of the absence of any 
physical marker to allow retrieval to occur of items earlier than the last event (Frankish, 
1976). On the view that the results of Broadbent (1975) were due to use of a specialized 
acoustic store, this experiment ought to give results more similar to Expt 1 of the present 
paper. 


Materials 


The usual lists of items were used, but on this occasion were recorded on audiotape. In addition, 
each item started half a second after the preceding item in the fast condition, and one second in the 
slow condition; with no extra interval at the end of a group. At the end of the list, a spoken suffix 
occurred at the same interval as that between digits; the suffix was always an apparent instruction for 
recall, being the word ‘direct’ for two blocks of lists, and the word ‘backward’ for the other two. 
Each list was also preceded by a warning cue, namely a voice saying ‘ready’; this preceded the first 
digit by the same interval as that within the particular list, so that the lists had a prefix and suffix as 
well as being ungrouped. 


Subjects and design 


Twenty-four adult women from the Oxford Subject Panel were tested; the design was exactly as for 
Expt 1, except that only six subjects received each order of condition rather than 10. It was not 
thought safe to assume that confounding would be harmless in this case. It should be especially noted 
that answer sheets with grouped spaces were still used; hence the slight evidence for some grouping in 
the results. 


Results 


The data are shown in Fig. 4. Visually, they resemble Expts 1 and 2 rather than the data 
for grouped auditory presentation shown in Fig. 2. As in Expts 1 and 2, there is a main 
effect of position of a group in recall order, F = 166.95, d.f. = 2, 46, MSe = 23.928, 
P < 0.001. Similarly, there is an interaction of this factor with direct/backward recall, as in 
Expt 1; F = 25.87, d.f. = 2, 46, MSe = 16.547, P < 0.001. However, this time the 
interaction is due not only to the first group recalled, but also the last; and 
correspondingly, the main effect of direct/backward recall is insignificant. Numerically it is 
even in the opposite direction to Expt 1, and F = 3.03, d.f. = 1, 23, MSe = 28.21, 0.10 > 
P > 0.05. This is in the same direction as the result of Broadbent (1975), even though it 
would not by itself have been statistically acceptable. 

Like Expt 1, however, this study shows an interaction between speed of presentation and 
position of a group in recall, F = 3.20, d.f. = 2, 46, MSe = 12.269, P < 0.05. As in Expt 1, 
slow presentation is particularly helpful at the end of the recall. 

Had Broadbent (1975) used this technique, therefore, the results would, like the original 
auditory experiment, have pointed against Stage 1 being an output buffer. They are a 
compromise between the visual results and those for grouped auditory presentation. It 
seems clear that the inference one draws from experiments of this type depends upon the 
sensory modality used in presentation. With audition, backward recall is not better than 
forward recall. 


Further analysis and discussion 


Thus far, we have shown only that visual presentation gives different results from auditory. 
We now have to consider why this should be. Broadly, it seems clear that auditory 
information has available to it some extra form of sensory or precategorical storage, which 
visual information does not. Thus a series of sounds is converted into a series of item 
codes in a sensory store, and when all these item codes are present, a fresh group code is 
formed; whereas in the case of vision no sensory store is available, and therefore each event 
forms a code in an output sequence. The group code is formed from the separate output 
codes. In general, one can see that this will make the visual presentation benefit more 
markedly from recall of the last items first. In the terms of Broadbent (1975), Stage 1 is an 
output buffer in the visual case, but a sensory buffer in the auditory case. 

As noted earlier, a temporary or short-lasting advantage for auditory material is widely 
accepted. To allow grouping, a sensory register would need to last only 2-3 s even at the 
slow rate of presentation. Such a duration would be within the limits found by many 
investigators (see, for example, Darwin et al., 1972). The use of a temporary buffer in this 
way could however change the efficiency of recall even after quite a long period, because it 
would change the longer-lasting group code. Indeed such a view is implied in the major 
body of work on precategoric acoustic storage stemming from Crowder & Morton (1969). 
In that paradigm the last items of a list gain some benefit from acoustic presentation even 
though recall of the rest of the list takes place before recall of those items; as if sounds 
have an advantage in creating a postcategoric code which can survive response 
interference. With grouped material the same PAS effect applies in every group individually 
and not merely on the last items of the series; this was first pointed out by Frankish (1976) 
and appears in the present results also. For example, in the auditory data of Broadbent 
(1975) the sixth item presented was better recalled than the fourth, whether recall was 
direct or reversed (P < 0.02 by sign test in either case). That is, a recency effect appeared 
within a group and not merely on the last item of the list. In the corresponding conditions 
of Expt 1, but with visual presentation, there was no indication whatever of recency within 
the group, and the difference between the experiments in number of subjects showing 
recency was itself significant by χ², P < 0.02 for either recall order. 
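
For readers unfamiliar with the sign test used above, a minimal sketch follows. The function name is hypothetical and the subject counts used to call it would have to be supplied by the reader; the paper reports only the resulting probability, not the counts.

```python
from math import comb


def sign_test_two_sided(n_plus: int, n_minus: int) -> float:
    """Exact two-sided sign test; tied subjects are assumed to have been dropped.

    n_plus  : subjects recalling the sixth item better than the fourth
    n_minus : subjects showing the opposite difference
    """
    n = n_plus + n_minus
    if n == 0:
        raise ValueError("need at least one non-tied subject")
    k = min(n_plus, n_minus)
    # Probability of a split at least this extreme under a fair-coin null.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Under the null hypothesis each subject is equally likely to favour either item, so the test depends only on how many subjects fall on each side, not on the size of their differences.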

Most of our results therefore can be handled purely by the familiar concept of a 
temporary sensory store peculiar to audition. On that view the novel feature of the results 
is to demonstrate that the use of this store during presentation may have unexpected 
longer-term effects. However, there are some details which are still puzzling. If acoustic 
stimulation produces a code more resistant to response interference, why does hearing, 
unlike vision, show a significant impairment from recall in the backward order? It is as if 
memory for spoken words was subject to more impairment than memory for visual 
information, not less. To check on this point, it is worth examining the data again 
excluding the very last group presented. This is because, in acoustic presentation, this last 
group will benefit from any more durable code if it is recalled last but will not benefit if it 
is recalled first. This would minimize the advantage of reversed recall; whereas in vision 
reversed recall will give a large advantage for this last group. Thus the theory of an 
extremely temporary auditory store could explain the results we have obtained, but would 
attribute them solely to performance on the last group presented. On the first two groups 
presented, reversed recall will always mean greater intervening activity; if acoustic 
presentation has produced a more durable code, it should give less impairment by reversed 
recall. Our earlier analysis however might have failed to reveal such an effect because it 
was swamped by the larger change in the opposite direction on the last groups. 

Data were therefore analysed separately for the first two groups presented, ignoring the 
last. The 28 subjects of Broadbent (1975) provided data on acoustic performance, and 20 
of the subjects of Expt 1 were used for the visual condition. The other subjects were left 
aside as having received an order of presentation conditions which did not occur in the 
auditory experiment, the design of the latter being similar to that of Expt 2. Thus the 
comparison of audition and vision was made for exactly the same materials, speeds, 
presentation order, and with recall either before or after the group now eliminated from 
analysis. 

In every subject, whether visual or auditory, reversed recall gave more errors; so the key 
question is whether this impairment is larger for audition or for vision. As the error score 
is proportional, and in some cases near ceiling levels of performance, some care is needed 
in making the comparison; as might be expected for such data the mean for each of the 16 
combinations of speed, recall instructions, sense organ, and order of presentation was 
heavily correlated with the variance under the same condition (tau = 0.64). For each 
subject in each condition therefore the error score was transformed to Z scores, which 
successfully eliminated this correlation (tau < 0.03). The Z score for backward recall was 
then subtracted from that for forward recall; this difference is significantly greater for 
audition than vision (U test, Z = 2.49, P = 0.013). It should be noted however that the 
significance is not satisfactory by analysis of variance, F = 2.98, d.f. = 1, 44, 
MSe = 0.544, 0.10 > P > 0.05. The reason for this unusual discrepancy is shown in the 
distribution of differences illustrated in Table 1; a few subjects show very large impairment 
by reversed recall, one in particular departing from the mean of the group by 3.4σ. Close 
inspection shows that these subjects were adopting a quite different strategy from the others, 
only attempting to recall six items and therefore giving essentially zero performance on the 
last group recalled whatever its position of presentation. The large variance thus 
introduced disrupts parametric statistics but not non-parametric ones. 
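
The comparison just described can be sketched as follows. Two points are assumptions rather than anything stated in the text: the Z transformation is read here as an inverse-normal (probit) transform of the error proportion, and the U test is computed with the large-sample normal approximation without a tie correction. The data and all names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import List, Sequence, Tuple

_STD = NormalDist()


def probit(p: float, n_items: int) -> float:
    """Inverse-normal transform of a proportion, clamped away from 0 and 1."""
    eps = 1 / (2 * n_items)
    return _STD.inv_cdf(min(max(p, eps), 1 - eps))


def differences(pairs: Sequence[Tuple[float, float]], n_items: int = 24) -> List[float]:
    """Forward minus backward score per subject, on the transformed scale."""
    return [probit(f, n_items) - probit(b, n_items) for f, b in pairs]


def mann_whitney_z(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U for x, z, two-sided p) by the normal approximation."""
    n1, n2 = len(x), len(y)
    # U counts the pairs in which x exceeds y; a tie counts one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, z, 2 * (1 - _STD.cdf(abs(z)))


# Hypothetical (forward, backward) error proportions for three subjects per modality.
auditory = [(0.10, 0.40), (0.20, 0.55), (0.15, 0.50)]
visual = [(0.20, 0.30), (0.25, 0.30), (0.10, 0.20)]
u, z, p = mann_whitney_z(differences(auditory), differences(visual))
```

Because the U statistic uses only the rank order of the differences, a single subject with an extreme score moves it very little, whereas the same subject inflates the error term of an analysis of variance. This is the discrepancy between the two tests noted in the text.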

Table 1. Number of subjects showing each degree of difference between direct and reversed 
recall for the first two groups presented 

Differences in Z scores 
(direct-reversed)   0-0.5  0.5-1.0  1.0-1.5  1.5-2.0  2.0-2.5  2.5-3.0  3.0-3.5  3.5-4.0  4.0-4.5 

Visual                5       5        6        2        1        0        0        0        1 
Auditory              3       1       10       10        3        0        1        0 

It seems fair to conclude: (a) that the result obtained in a study of this type will depend 
on the strategy adopted by the majority of subjects; (b) that if they adopt the strategy of 
producing only six responses there will be no difference in the effect of recall of vision or 
audition; (c) that if, however, as in the majority of the subjects, they are attempting to 
reproduce all groups, then the impairment by reversed recall will be greater for audition 
than for vision; (d) that whatever the strategy, we have found no evidence for superior 
resistance to interpolated activity when presentation was auditory. 

These results are very difficult, if not impossible, to explain on the theory of a 
short-lasting PAS. Frankish (1976) suggests an alternative possibility, that acoustic 
information may still be available in an echoic store, even after long time intervals. To 
retrieve information from such a store, however, requires some physical cue which will 
allow the appropriate uncategorized items to be separated from the others in the acoustic 
buffer. When stimuli are presented at regular time intervals, in the same voice, and so on, 
only the last few items are marked out in a way which allows them to be retrieved from 
echoic storage at the time of recall. If, however, the stream of incoming stimuli is 
segmented by simple physical characteristics, such as the time intervals of the present 
experiments, then acoustic information may still be retrieved from echoic storage even after 
considerable time intervals. 

This view is an extension of the view of sensory buffer storage put forward by Broadbent 
(1971, pp. 353-354); according to which the buffer store is organized in a hierarchic way. 
Some features of an event are more fundamental than others, and they can indicate the 
region of buffer storage in which the stimuli are located; retrieval from the buffer takes the 
form of extracting all items possessing some one feature in order to form a fresh code from 
the combination of features possessed by that event. Thus the sensory buffer can contain 
quite large quantities of information, divided into separate regions by the basic or 
fundamental features of the sensory field. As time goes on it will become increasingly likely 
that fresh events will over-write those in any particular region, but when no such 
over-writing has occurred, echoic storage can still provide information even at the time of 
recall. 

Following this approach, we could argue that in auditory experiments the process of 
forming the group code does not always occur until recall, and that Stage 1 is a sensory 
buffer. Hence it is disturbed by recalling the last items first, and hence the relatively 
harmful effects of slowing down presentation. In the visual case, however, the group code 
is formed during presentation, from an output code, and hence the beneficial effects of 
reversed recall, and of a slower presentation making it easier to form the group code. The 
notion that echoic storage is still available at the time of recall is supported by data of 
Broadbent et al. (1977); they found a recency effect in audition which survived intervening 
response to visual stimuli but did not survive an intervening acoustic task, whereas the 
recency effect in vision was reduced by any activity whether visual or auditory stimuli were 
involved. Other related results have been found by Anderson & Craik (1974), by Gardiner 
et al. (1974), Watkins & Watkins (1977), and by O. C. Watkins (1977). The notion of an 
echoic storage which persists for a very long time is markedly contrary to the views of the 
past decade; but, it can well be argued that the notion of echoic storage as relatively short 
term is due to concentration on experimental paradigms which use regular sequences of 
stimuli, similar in the fundamental or organizing physical features, and which therefore 
produce a great deal of over-writing of earlier events. For example, Madden & Bastian 
(1977) show little advantage for probing acoustic memory with a stimulus in the same 
voice as that of presentation rather than a different one; when the series of incoming 
stimuli is in fact always one voice. Craik & Kirsner (1974), however, showed a much more 
marked advantage when the original sequence of stimuli was broken up by the use of 
different voices. In everyday life the sequence of events is much less regular and 
monotonous than that used in laboratory paradigms, and it seems extremely plausible that 
echoic memory may be able to retrieve quite remote information, provided it is spoken in a 
pitch, from a location, or by a person, which have not been involved in any subsequent 
stimulation. Indeed, this is presumably the origin of the very prolonged double-take which 
so often occurs in ordinary life, and which an acoustic memory of only a second or two 
could ill explain. 

A last query remains; all our experiments show visual information as requiring 
immediate transfer to an output buffer, with no long-lasting iconic component. Yet 
although we have no evidence for a visual analogy to long-lasting echoic memory, this may 
be due to the particular experimental technique employed. Certainly visual grouping by 
intervals of time does not produce the same results as acoustic grouping by the same 
physical feature. However, all our visual events, as well as those of Broadbent et al. (1978) 
took place at the same spatial location. Although some other investigations (e.g. Nilsson et 
al., 1975) have varied the spatial location of presentation, they have done so with verbal 
materials, and without controlling rehearsal or output order, so that the relative amount of 
recall from recent and earlier events is contaminated by postcategorical and non-sensory 
encoding of the information. In studies of iconic storage, it is now conventional to use 
pattern masks at the same visual location as the original input. Perhaps location, rather 
than pausing in time, may act as the primitive feature for separation of items in the visual 
sensory store. Until this possibility is excluded, it is still conceivable that visual inputs may 
persist in a relatively primitive form of encoding, if there is no over-writing by other events 
similar in basic physical characteristics. 


Conclusions 


At the very least modality of arrival seems far more important in memory than was 
assumed a few years ago. If sensory registers (icons) are only temporary, the effects of their 
use can persist well into long-term retention. Many details of the data are far easier to 
explain if we suppose that sensory registers retain information, not for a brief period, but 
indefinitely until over-written by later events of the same type. 


An investigation of memory functions in dyslexic children 
Hazel E. Nelson and Elizabeth K. Warrington 





A detailed investigation of short-term memory storage, long-term memory storage and semantic 
memory associated with developmental dyslexia is reported. Fifty-one dyslexic and 28 control 
children were tested, and the following deficits were found amongst the dyslexic children. Short-term 
memory functions were impaired both with respect to storage capacity and rate of decrement. 
Long-term memory functions for visual material were normal but verbal long-term memory functions 
were impaired. There was evidence of difficulties in incorporating new material into semantic memory 
but the rate of accessing information from within this system appeared to be normal. The significance 
of the memory deficits is discussed in relation to the hypothesis that childhood dyslexia is a double 
deficit of the graphemic-phonemic and graphemic-semantic reading routes. 





Despite the continuing debate about the status of developmental dyslexia there is general 
agreement with the observation that some children find learning to read and spell 
particularly difficult. In order to avoid prejudging the issue, for the purposes of this paper 
the term ‘developmental dyslexia’ is defined as ‘a failure to develop literacy skills to a level 
commensurate with the child’s age and level of general intellectual functioning that cannot 
be attributed to inadequate socio-cultural opportunity or teaching, to brain damage, or to 
some emotional or personality disorder’. 

Since literacy skills and intelligence are not perfectly correlated in the general population 
one would expect to find some children who are relatively slow in their development of 
reading and spelling skills, but the fact that there are more children with these difficulties 
than one would predict from statistical considerations (e.g. Yule & Rutter, 1976) suggests 
that they should not be regarded merely as members of the extreme of a normal 
population. Furthermore, in addition to their poor development of reading and spelling 
skills, dyslexic children characteristically have lower verbal than performance IQs on the 
WISC (e.g. Warrington, 1967; Naidoo, 1972), with the digit span subtest being particularly 
poor (see for example Rugel’s 1974 review). Parents often report slower than usual 
development of speech and language in their affected children (see Ewing, 1930; 
Warrington, 1967). Thus any theory concerning the cause of the reading and spelling 
difficulties should also be able to encompass these other characteristic features of 
developmental dyslexia. 

Many theorists accept the distinction, in normal subjects, between short-term storage 
and long-term storage (this terminology is adopted to denote theoretical memory systems 
rather than the temporal conditions of the experiment) (Waugh & Norman, 1965; 
Atkinson & Shiffrin, 1968; Baddeley, 1976). More recently separate semantic memory 
systems have been postulated (Tulving, 1972). Neuropsychological evidence provides strong 
support for the existence of such multiple memory systems; independent deficits of 
short-term storage, long-term storage and semantic memory have now been recorded in 
patients with cerebral lesions (cf. Warrington & Weiskrantz, 1973; Warrington, 1975; 
Shallice, 1979). Previous investigations of learning and memory functions in developmental 
dyslexia will be summarized briefly within this theoretical framework which is known to be 
relevant to the cerebral organization of memory systems. 

The characteristic finding of a relatively poor digit span in dyslexic children suggests a 
deficiency in some aspect of short-term memory storage. More detailed investigations of 
modality-specific and material-specific effects on span of apprehension in dyslexic children 
have generally suggested that they are differentially impaired on tasks using linguistic 
material (e.g. Bakker, 1967), but no consistent pattern of difficulty has emerged on tasks 
which compare auditory and visual presentation (e.g. Standsteadt, 1964; cf. Senf & 
Feshbach, 1970). 

From the studies of long-term memory functions two trends emerge. First, in dyslexic 
children retention of visual non-verbal material tends to be relatively superior to their 
retention of verbal material (Guyer & Friedman, 1975). Secondly, whereas semantically 
organized material normally produces substantially better recall than non-organized 
material in normal children (Drew & Altman, 1970), this advantage is not so marked for 
dyslexic children (Freston & Drew, 1974; McManman et al., 1974). 

The English orthographical system, using only 26 letters and a variable set of 
phoneme-grapheme equivalents, is a potentially rich source of interference in the 
acquisition of reading and spelling skills. However, the possible roles of proactive and 
retroactive interference in developmental dyslexia remain unclear, some studies suggesting 
that dyslexic children may be less susceptible than normal to these effects (e.g. Shankweiler 
& Liberman, 1976) and others suggesting that they may be more susceptible (e.g. 
Farnham-Diggory & Gregg, 1975). 

Semantic memory functions (i.e. verbal knowledge systems) have not yet been 
investigated directly in dyslexic children. However, the consistent reports of relatively weak 
WISC vocabulary scores (e.g. Naidoo, 1972; Nelson & Warrington, 1974) would be 
compatible with impairment in one or more aspects of semantic memory. The role of 
semantic organization was briefly mentioned above in the context of long-term memory 
storage, and it is possible that the failure of dyslexic children to take maximal advantage of 
verbal mediation is reflecting an impoverished semantic memory system. 

The aim of the present study was to undertake a comprehensive investigation of memory 
functions in a group of dyslexic children. The rationale for the selection of the individual 
tests was provided by a multiple systems view of memory functions, and the tests were 
adapted from techniques available for the investigation of memory in adults. 


Method 
Subjects 


The dyslexic group. The dyslexic group comprised 51 children drawn from a consecutive series of 
referrals to the Psychology Department of the National Hospital, Queen Square, during a 15 month 
period: all the children had been referred for educational difficulties. 

The following criteria were set for inclusion in this group: 

(i) age between 8-0 and 14 years; 

(ii) either (a) the WISC verbal, performance and full scale IQs were all higher than 85; or (b) the 
WISC verbal or performance IQ was between 75 and 85 provided that the full scale IQ was more 
than 90; 

(iii) (a) if the WISC full scale IQ was 86-100 then both reading and spelling ages had to be more 
than 2 years below CA; 

(b) if the WISC full scale IQ was 101-115 then both reading and spelling ages had to be at 
least 1½ years below CA; 

(c) if the WISC full scale IQ was 116+ then both reading and spelling ages had to be at least 
1 year below CA; 

(iv) there had to be no evidence of inadequate schooling or poor teaching, emotional or psychiatric 
disturbance, and no neurological evidence of brain damage. 
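
The four criteria can be restated as a simple decision rule. The sketch below is one reading of the list above, not the authors' own procedure: the field names are hypothetical, ages are expressed in decimal years, the age limits are taken as inclusive, and the middle band of criterion (iii) is assumed to require a lag of one and a half years.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Child:
    age: float                 # chronological age (CA), years
    verbal_iq: int
    performance_iq: int
    full_scale_iq: int
    reading_age: float         # years
    spelling_age: float        # years
    excluded_on_history: bool = False  # criterion (iv), judged clinically


def required_lag(full_scale_iq: int) -> Tuple[float, bool]:
    """Criterion (iii): (years below CA, strict), where strict means 'more than'."""
    if full_scale_iq >= 116:
        return 1.0, False
    if full_scale_iq >= 101:
        return 1.5, False      # assumed value for the middle band
    return 2.0, True


def meets_criteria(c: Child) -> bool:
    # (i) age range, limits assumed inclusive
    if not (8.0 <= c.age <= 14.0):
        return False
    # (ii) either all three IQs above 85, or one scale as low as 75
    #      provided the full scale IQ exceeds 90
    all_above_85 = min(c.verbal_iq, c.performance_iq, c.full_scale_iq) > 85
    one_scale_low = c.full_scale_iq > 90 and min(c.verbal_iq, c.performance_iq) >= 75
    if not (all_above_85 or one_scale_low):
        return False
    # (iii) both reading and spelling must lag behind CA by the required amount
    years, strict = required_lag(c.full_scale_iq)
    lags = (c.age - c.reading_age, c.age - c.spelling_age)
    lag_ok = all(lag > years if strict else lag >= years for lag in lags)
    # (iv) no schooling, psychiatric or neurological grounds for exclusion
    return lag_ok and not c.excluded_on_history
```

The sliding requirement in criterion (iii) means that a brighter child qualifies with a smaller absolute lag, so that the discrepancy between attainment and expectation is roughly comparable across the IQ range.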

In fact none of the children seen during this period had to be excluded from this study on the 
grounds of low WISC results. One child had to be excluded because he was found to have an 
epileptic focus, and a family of two children were excluded on the grounds of probable psychiatric 
disturbance. The remaining children seen during this period who were omitted were excluded either 
on the grounds of age, or because their reading and spelling abilities were not sufficiently retarded. 
For reasons extraneous to the investigation not all dyslexic children in this group completed every 
experimental test, but in every case the particular group of dyslexic children completing a memory 
test matched the control group in age and intelligence factors (see Nelson, 1978). 


The control group. The control group comprised 26 children whose reading and spelling attainments 
were not more than 6 months below their chronological age. They were selected to be matched with 
the dyslexic group for sex composition, for educational background (state vs. private) and for age. 

A common problem in neuropsychological studies, and one which applies to the present study, is 
how to match a control group to the experimental group when one knows (or suspects) that specific 
aspects of intellectual functioning are impaired in the experimental group. The normal solution is to 
match on the unimpaired aspects of intellectual functioning. In the case of dyslexic children the 
verbal IQ of the WISC is characteristically lower than the performance IQ (see for example Rugel's 
1974 review) implying that there is some relative impairment in verbal IQ in dyslexic children and 
therefore that the intellectual matching of dyslexic and control children should be on the basis of 
performance IQ. In the present study testing time for the control subjects was limited so only one 
subtest could be administered from the performance scale; Picture Completion was selected on the 
grounds that this is the subtest which the majority of studies agree appears to be unaffected in 
dyslexia. The Similarities subtest was also administered because although dyslexic children tend to 
score lower on all verbal subtests Similarities appears to be the least affected, and it was considered 
feasible to obtain a control group that was closely matched to the dyslexic group on Picture 
Completion scores and that did not differ significantly on Similarities scores. (It should be noted that 
one would expect a control group selected to match a dyslexic group on Picture Completion and 
Similarities subtest scores to be superior on those aspects of intellectual functioning (e.g. digit span 
and vocabulary) which are specifically impaired in dyslexia.) 


Testing procedure 


The children in the dyslexic group were tested in two sessions on the same day. In the first session the 
full WISC (Digit Span being substituted for Comprehension) was administered followed by the 
Schonell Graded Word Reading and Spelling tests and the English Picture Vocabulary test (EPVT). 
The remaining six memory tests (see below) were administered in a fixed order (Tests 4, 5, 3, 8, 6, 2) 
in the second session. 

The children in the control group were tested in one session. Three WISC subtests (Picture 
Completion, Similarities and Digit Span) were followed by the Schonell reading and spelling tests, 
then the EPVT and the other six memory tests (also in the order 4, 5, 3, 8, 6, 2). 


Group matching and literacy skills results 


Details of educational background, sex composition and age for the dyslexic and control groups are 
given in Table 1; the appropriate statistical tests showed that there were no significant differences in 
the composition of the two groups with regard to these factors. 

The WISC results for the dyslexic group are given in Table 2 (the full WISC was not administered 
to control subjects). The verbal/performance discrepancy is consistent with that found in previous 
investigations (e.g. Warrington, 1967) and suggests that the present dyslexic group is representative of 
the dyslexic population as a whole. 


Table 1. Educational background, sex composition and age of the dyslexic and control groups 

                            Dyslexic group    Control group 
                            (n = 51)          (n = 26)          Stat test     P 
Educational background 
  State:Private             31:20             16:10             χ² = 0.1      n.s. 
Sex composition 
  Male:Female               39:12             20:6              χ² = 0.1      n.s. 
Age (years) 
  Mean                      11.4              11.5              t = 0.4       n.s. 
  (SD)                      (1.6)             (1.4) 

Table 2. Results of WISC tests in the dyslexic and control groups 

                                Dyslexic group     Control group 
                                (n = 51)           (n = 26) 
WISC                            Mean      SD       Mean      SD       t       P 
Verbal IQ                       100.2     11.4 
Performance IQ                  110.6     10.2 
Full-scale IQ                   105.6     10.3 
Similarities subtest            12.4      2.5      13.4      2.0      1.8     n.s. 
Picture Completion subtest      11.1      2.8      11.7      2.8      0.9     n.s. 
With control group scores regressed to mean 
Similarities subtest            12.4      2.5      12.7      1.6      0.6     n.s. 
Picture Completion subtest      11.1      2.8      11.2      1.8      0.2     n.s. 





The results of the Similarities and Picture Completion tests for the dyslexic and control groups are 
given in Table 2: the t tests indicated no significant differences between the two groups on these tests. 
If one considers regression effects to be operating because of the way in which the control group was 
selected in order to match an above-average experimental group then an even closer group matching 
was achieved (see Table 2). Most of the memory experiments were based on subsamples of the 51 
dyslexic children, but for every memory test the subsample of dyslexic children matched the control 
children for sex composition, age, Similarities and Picture Completion scores. 

Post hoc inspections of the intertest product moment correlations in the dyslexic group showed 
that except for digit repetition (r = 0.58, P < 0.01) and picture vocabulary (r = 0.67, P < 0.01) the 
WISC IQ was not significantly related to the memory test results (no r more than 0.3). Thus for the 
majority of memory tests it appears that IQ is not a significant determinant. 

The attainment scores and standard deviations on the literacy tests for the two groups are given in 
Table 3. The marked and highly significant differences between the dyslexic and control groups 
merely reflect the criteria for group selection and need no further comment. It should be noted, 
however, that in this dyslexic group the degree of reading retardation is nearly as great as the degree 
of spelling retardation. 


Table 3. Results of Schonell literacy tests 

                                    Dyslexic group     Control group 
                                    (n = 51)           (n = 26) 
                                    Mean      SD       Mean      SD       t        P 
Schonell GWRT 
  Reading age (years)               8.2       1.7      14.0      2.1      12.9     0.001 
  Reading retardation relative      3.2       0.9      -2.5      1.5      17.1     0.001 
    to CA (years) 
Schonell GWST 
  Spelling age (years)              7.7       1.4      13.5      2.0      14.7     0.001 
  Spelling retardation relative     3.7       1.0      -2.0      1.5      11.8     0.001 
    to CA (years) 

Investigation of short-term memory storage 
Test 1 — Digit Span 


Digit span is a measure of the capacity of auditory-verbal short-term storage, and Rugel's 
(1974) extensive review of WISC profile studies showed that dyslexic children 
characteristically perform very poorly on this task. 


Procedure. The Digit Span subtest of the WISC was administered according to the standard 
procedure and in addition to recording the age-scaled scores for each subject the actual 
numbers of digits repeated forwards and backwards were also recorded. 


Results. Comparing the age-scaled scores of the dyslexic group with the WISC normative 
data it can be clearly seen that despite being of above-average IQ generally the dyslexic 
group are relatively impaired on the digit span subtest. 

The mean numbers and standard deviations of digits repeated forwards and backwards 
for the dyslexic and control groups are given in Table 4. The dyslexic group are very 
significantly poorer than the control group in both these conditions (see Table 4). The 
ANOVA confirmed the significant groups effect but showed no significant tests × groups 
interaction (F = 3.8, n.s.). This latter result indicates that the dyslexic group was not 
relatively more impaired in either the forwards or backwards condition. 


Table 4. Results of memory test 1, memory span for digits 

                                  Dyslexic group     Control group 
                                  (n = 51)           (n = 26) 
                                  Mean      SD       Mean      SD       t       P 
Digits forwards                   5.04      1.23     6.71      0.95     6.4     0.001 
Digits backwards                  3.65      0.98     4.54      1.17     3.3     0.002 
Digit span (WISC age-scaled)      8.24      2.45     12.65     2.77     6.9     0.001 
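
The t values in Table 4 can be checked approximately from the tabulated means, standard 
deviations and group sizes. The Python sketch below is an illustration only. The text does not 
state which form of the t test was used, so both the pooled-variance (Student) and the unpooled 
(Welch) statistics are computed; the function names are hypothetical. 

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t for two independent groups, assuming equal variances."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se


def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t for two independent groups, without assuming equal variances."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / se


# Digits backwards row of Table 4 as (mean, SD, n).
# The control group is passed first so that t comes out positive.
control = (4.54, 1.17, 26)
dyslexic = (3.65, 0.98, 51)

print(round(pooled_t(*control, *dyslexic), 2))
print(round(welch_t(*control, *dyslexic), 2))
```

For this row the unpooled statistic comes out close to the tabulated value of 3.3 and the pooled 
statistic is somewhat larger, but this agreement should not be read as confirmation of the 
method actually used. 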





Test 2 — Brown-Peterson paradigm 


Although the capacity of auditory-verbal short-term storage in dyslexic children has been 
relatively well investigated it is not known whether or not in a delayed recall condition the 
rate of decrement in performance would be abnormal. The Brown-Peterson short-term 
forgetting task provides a measure of short-term retrieval and was used to investigate this 
aspect of short-term storage in dyslexic children. 


Procedure. Following the experimental paradigm described by Peterson & Peterson (1959) 
recall of pairs of letters was tested after short intervals filled with a distractor task to 
prevent rehearsal. The test was divided into four blocks each comprising 12 pairs of letters 
drawn at random from a pool of 32 consonants (the 21 consonants+11 repeats), the only 
constraints being (a) that no letter recur before the intervention of two other letter pairs 
and (b) that no letter pair form a well-known abbreviation (e.g. GB). Each subject received 
a uniquely selected random set of items. 
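
The selection constraints above can be sketched in code. The following Python example is a 
minimal illustration under stated assumptions, not the procedure actually used: the text does 
not say which 11 consonants were repeated (here a random subset is taken), the list of excluded 
abbreviations is hypothetical, and a pair made of the same letter twice is also rejected. A block 
is drawn repeatedly until it satisfies the constraints (rejection sampling). 

```python
import random
import string

VOWELS = set("AEIOU")
CONSONANTS = [c for c in string.ascii_uppercase if c not in VOWELS]  # 21 letters
ABBREVIATIONS = {"GB", "TV", "PC", "BC", "MP"}  # hypothetical exclusion list


def is_valid(pairs, min_gap=3):
    """Check constraints (a) and (b) for a sequence of two-letter pairs.

    min_gap=3 means a letter may recur only after two other pairs have intervened.
    """
    last_seen = {}
    for idx, pair in enumerate(pairs):
        # Assumption: a doubled letter within one pair is also disallowed.
        if pair in ABBREVIATIONS or pair[0] == pair[1]:
            return False
        for letter in pair:
            if letter in last_seen and idx - last_seen[letter] < min_gap:
                return False
            last_seen[letter] = idx
    return True


def make_block(rng, n_pairs=12, max_tries=10000):
    """Draw one block of letter pairs from a pool of 32 consonants."""
    # Assumption: the 11 repeats are a random subset of the 21 consonants.
    pool = CONSONANTS + rng.sample(CONSONANTS, 11)
    for _ in range(max_tries):
        letters = rng.sample(pool, 2 * n_pairs)
        pairs = [letters[i] + letters[i + 1] for i in range(0, 2 * n_pairs, 2)]
        if is_valid(pairs):
            return pairs
    raise RuntimeError("no valid block found")


# One block for one subject; a different seed gives a different random set.
print(make_block(random.Random(0)))
```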

For the first and fourth block of letter pairs the stimuli were presented auditorially at a 
rate of one per second; for the second and third blocks the stimuli were presented visually 
(black Letraset capitals, 4 cm high, printed on blank white playing cards) also at a one per 
second rate. In order to prevent overt or subvocal articulation during the visual 
presentation of the material the subject was required to repeat rapidly his own name. 

Retention intervals of 0, 5, 10 and 15 s were randomly assigned within consecutive 
blocks of four items; thus each interval was used six times for each presentation condition. 
The distractor task was to add one to each of a random series of digits which were placed 
before the subject as rapidly as he could deal with them. 
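
The interval assignment can be illustrated with a short Python sketch. It assumes that each 
presentation condition comprises its two blocks of 12 letter pairs (24 items in all) and that 
"randomly assigned within consecutive blocks of four items" means each run of four items 
receives the four intervals once each in random order; the function name is hypothetical. 

```python
import random

INTERVALS = [0, 5, 10, 15]  # retention intervals in seconds


def assign_intervals(n_items, rng):
    """Give every consecutive run of four items each interval once, in random order."""
    if n_items % len(INTERVALS) != 0:
        raise ValueError("n_items must be a multiple of four")
    schedule = []
    for _ in range(n_items // len(INTERVALS)):
        schedule.extend(rng.sample(INTERVALS, len(INTERVALS)))
    return schedule


# Two blocks of 12 letter pairs per presentation condition give 24 items.
schedule = assign_intervals(24, random.Random(1))
counts = {interval: schedule.count(interval) for interval in INTERVALS}
print(counts)  # {0: 6, 5: 6, 10: 6, 15: 6}
```

Whatever the random order, 24 items divided into runs of four give six uses of each interval, 
which is the figure stated in the text. 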


Results. The mean numbers of letters recalled (irrespective of order) for each interval and 
presentation condition for the dyslexic group and control group are shown in Fig. 1. An 
analysis of variance was computed; the group difference was highly significant (F = 23.7, 
P < 0.001) and the difference between visual and auditory presentation was also significant 
(F = 11.44, P < 0.01). However, the groups × conditions interaction term was not 
significant (F = 0.44, n.s.), which indicates that there was no differential effect of stimulus 
modality for the dyslexic group. The dyslexic group was significantly worse than the 
control group for retention intervals of 5, 10 and 15 s under both auditory presentation 
(t = 4.3, 6.2 and 4.9 respectively, all P < 0.001) and visual presentation (t = 4.3, 4.1 and 3.6 
respectively, all P < 0.002). 


Figure 1. Results of memory test 2, Brown-Peterson paradigm: number of letters recalled 
plotted against retention interval for the control and dyslexic groups under auditory and 
visual presentation. 


Investigation of long-term memory storage 
Test 3 — Recognition memory for words and faces 


Retention tests by recognition memory are particularly suitable for the direct comparison 
of visual and verbal long-term memory. A forced-choice recognition memory test for 
words and faces was adapted from a paradigm introduced by Warrington (1974) for use 
with adult subjects. 
Procedure. The stimuli for the verbal section of the test were 40 high frequency, low 
concrete words (Coughlan, 1976) and the stimuli for the visual section were 40 
photographs (unknown actors from Spotlight magazine, 10.5 × 8 cm). The words were 
spoken aloud by the experimenter at a one per 3 s rate, and the subject was required to 
repeat each word once. The photographs were also presented at a one per 3 s rate and the 
subject was required to judge each face as being young or old. In the retention test each 
stimulus item was paired with a ‘new’ distractor item (e.g. warmth — vision; shy — faster) 
and the subject was required to select the original stimulus item (no time limit was 
imposed). The same order of the stimulus items was used for the presentation and the 
retention test. The order of administration of the two sections of this test was alternated 
according to a balanced test design. 


Results. The means and standard deviations of items correctly identified in the dyslexic and 
control groups are given in Table 5. The results of the t tests show that the dyslexic group 
was significantly worse than the control group on the verbal section of the test but that 
there was no significant difference between the two groups on the visual section. The 
ANOVA comparing groups and conditions was computed and the significant 
groups × conditions interaction (F = 4.15, P < 0.05) confirmed the differential effect of 
stimulus material for recognition memory in dyslexic children. 


Table 5. Results of memory test 3, mean numbers of items recognized 

            Dyslexic group     Control group 
            (n = 36)           (n = 26) 
            Mean      SD       Mean      SD       t       P 
Words       32.4      2.8      35.2      3.1      3.8     0.001 
Faces       34.3      2.8      34.8      3.0      0.7     n.s. 


Test 4— Release from proactive interference 


Most investigations of interference phenomena use paradigms based on paired-associate 
learning, but in our experience such verbal paired-associate learning tasks were often 
stressful for children. The release from proactive interference technique seemed to be 
equally appropriate but less subject to anxiety factors. Build-up of PI can be demonstrated 
with successive recall tasks when the stimulus items are drawn from the same taxonomic 
class — similarly a ‘release’ from this PI can be obtained by a taxonomic shift in the 
stimulus items. The present test was adapted from a paradigm described by Craik & 
Birtwhistle (1971) which demonstrated that both the build-up of PI and the release from PI 
are long-term memory phenomena. 


Procedure. The 10-word stimulus lists were either names of animals or names of fruit and 
vegetables which were selected randomly from a pool of 30 common animals and 30 
common fruit and vegetables, each subject having an individually selected set of stimulus 
items. Three lists of animal names followed by one list of fruit and vegetable names or 
three lists of fruit and vegetable names followed by one list of animal names were 
administered to each subject, the two orders being alternated according to a balanced 
design. The lists were read by the experimenter at a one word per 2 s rate and each time the 
subject was told whether the list would be names of animals or names of fruit and 
vegetables. Free recall was attempted after a 30 s interval, filled with a distractor task of 
adding one to each of a random series of numbers as rapidly as possible. 


Results. The mean numbers of items recalled for each of the lists for the dyslexic group and 
the control group are shown in Fig. 2. In both groups there is a clear decrement in 
performance over the first three lists, and for both groups an increased recall score on the 
fourth list (the taxonomic category shift list). Thus the normal effect of build-up of PI and 
release from PI was obtained in both groups. The ANOVA confirmed the highly significant 
difference in list conditions (F = 30, P < 0.001) but showed that the group difference was 
not significant (F = 2.77, n.s.), nor was there a significant groups × conditions interaction 
term (F = 0.32, n.s.). Thus there was no evidence that the dyslexic group are more or less 
susceptible than the control group to inter-list interference effects. 


Figure 2. Results of memory test 4, build-up and release from PI: number of items recalled 
on each successive list recall for the control and dyslexic groups. 


Test 5 — Cross-modal learning and the effects of interference 


In order to investigate the effects of interference in a recall task where no verbal mediation 
strategy is apparent or suggested by the experimenter, a task was devised in which the 
subject had to learn to associate proper names with photographs of children's faces. By 
using different sets of faces and names it was possible to investigate three PA paradigms, 
viz. (i) a simple AB paradigm, (ii) an AB-ABr paradigm, which is known to produce large 
inter-list interference effects and (iii) an ABrCr paradigm, which has been found to 
produce large intra-list interference effects (Nelson, unpublished data). 


Procedure. Three sets of PA items were administered to all subjects. The stimulus for each 
item was the photograph of a child's face (approx. 10 × 8 cm, obtained from Spotlight 
magazine) and the required response was a name. Each set of four items started with a 
learning trial and this was followed by a test trial; the procedure continued with alternate 
learning and test trials until the criterion of all items correct in two consecutive test trials 
was reached, or until six test trials had been completed. The total number of errors made 
was recorded for each set. Within each learning trial the four stimuli were presented in 
random order, at the rate of one per 3 s, and as each face was presented the examiner gave 
the correct name which was repeated by the subject. Within each test trial the four stimuli 
were presented in random order and 6 s were allowed for the subject to give a name in 
response. The subject was encouraged to guess if uncertain; he was not told whether or not 
his response was correct. Randomization of stimuli within trials was only restricted by the 
considerations that the first item of a learning trial could not be identical to the last item of 
the preceding test trial, and vice versa. 
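
The stopping rule and error score described above can be expressed as a short Python sketch. 
It is an illustration only: the function name and the input format (one error count per test 
trial) are assumptions, and it covers only the scoring rule, not the presentation of stimuli. 

```python
def score_pa_set(errors_per_test_trial, max_trials=6):
    """Return (total_errors, test_trials_used) for one set of paired associates.

    Testing stops after two consecutive error-free test trials,
    or after max_trials test trials, whichever comes first.
    """
    total_errors = 0
    consecutive_perfect = 0
    trials_used = 0
    for errors in errors_per_test_trial[:max_trials]:
        trials_used += 1
        total_errors += errors
        consecutive_perfect = consecutive_perfect + 1 if errors == 0 else 0
        if consecutive_perfect == 2:
            break
    return total_errors, trials_used


# Hypothetical subject: criterion reached on the fourth test trial.
print(score_pa_set([2, 1, 0, 0, 3]))        # (3, 4)
# Hypothetical subject: criterion not reached within six test trials.
print(score_pa_set([1, 0, 1, 0, 1, 0, 0]))  # (3, 6)
```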

The first set of items, for the simple AB paradigm, consisted of photographs of four girls 
whose forenames (Anne, Helen, Jane and Linda) were designated by the examiner. The 
second set of items, for the AB-ABr paradigm, consisted of photographs of four boys 
whose forenames were designated by the individual subject (this procedure was adopted to 
facilitate the AB list learning). As soon as this list had been learned to criterion the four 
names were re-allocated to the four photographs by the examiner to form the ABr list 
(names 1, 2, 3 and 4 going to faces 3, 1, 4 and 2 in order to avoid any simple transfer 
scheme that might have been detected by the subject). The third set of items, for the ABrCr 
paradigm, consisted of four different girls' photographs (chosen to be as different as possible 
from the first set) which were given the names Susan Smith, Mary Smith, Susan Brown and 
Mary Brown by the examiner. 


Results. The means and standard deviations of errors in the dyslexic and control groups for 
each of the three PA paradigms are given in Table 6. The results of the t tests show that the 
difference between the two groups in the simple AB paradigm (learning girls' forenames) 
almost reached the 10 per cent significance level whilst in the slightly different procedure 
adopted for the AB-ABr paradigm the first learning of the dyslexic group was significantly 
poorer than that of the control group. In the latter paradigm, when the name-face 
associations were changed (i.e. the ABr list) the errors made before the new associations 
were learned were not significantly different in the two groups, and neither were the 
ABr - AB list scores (see Table 6). The dyslexic group were significantly worse than the 
control group in the complex ABrCr condition (learning the confusing double names). 
However, in order to allow for possible group differences in general PA learning ability an 
ANOVA was computed on the ABrCr and simple AB paradigm scores. This confirmed 
that the dyslexic group made more errors than the control group (F = 242, P < 0.01), but 
produced no significant tests × groups effect (F = 0.24, n.s.), which suggests that the group 
difference in the ABrCr paradigm was a reflection of the generally weaker PA learning 
ability in the dyslexic group and not an abnormal susceptibility to the effects of 
interference per se. 


Table 6. Results of memory test 5, mean numbers of errors to reach criterion 

                                     Dyslexic group     Control group 
                                     (n = 32)           (n = 26) 
                                     Mean      SD       Mean      SD       t        P 
AB 
  Girls' forenames                   6.75      6.81     4.15      5.26     1.64     n.s. 
AB-ABr 
  Boys' forenames (AB)               2.25      3.28     0.88      1.27     2.17     0.05 
  Boys' forenames changed (ABr)      6.19      5.62     6.92      5.18     0.52     n.s. 
  ABr - AB                           3.93      -        6.04      -        0.96     n.s. 
ABrCr 
  Girls' double names                11.28     5.83     7.92      4.94     2.38     0.02 


Investigation of semantic memory functions 
Test 6 — Vocabulary level 


Knowledge of word meanings is a central component of semantic memory and it is known 
that dyslexic children score relatively poorly on tests of vocabulary (e.g. Nelson & 
Warrington, 1974). But the word vocabulary of dyslexic children has hitherto been 
estimated using tests which require word definitions, so it is possible that the impairment 
demonstrated in dyslexic children is merely reflecting a difficulty in verbal expression rather 
than in vocabulary size per se. The English Picture Vocabulary test (EPVT) was chosen to 
provide an alternative means of estimating knowledge of word meaning. It must be 
emphasized that although vocabulary level is a good indicator of general intelligence level 
in normal children it cannot be used as a valid indicator of general intelligence level in 
dyslexic children. Just as it would be quite unsuitable to use the EPVT to give an estimate 
of general level of intellectual functioning in a dysphasic patient, so if vocabulary level is 
specifically impaired in dyslexia then the EPVT will underestimate intelligence level in 
dyslexic children. 


Procedure. The EPVT was administered according to the standard instructions described in 
the manual. A preliminary estimate of ceiling level was determined according to the criteria 
set out in the manual (five incorrect responses in eight consecutive items). The test was 
then continued past the ceiling level to obtain a pool of eight unknown words (used in 
Test 7). In so doing it was found not infrequently that a new basal level was recorded (i.e. 
five correct responses in eight consecutive items), and in these cases the testing was 
continued until the next ceiling level was attained and the vocabulary score calculated from 
the lower basal level and higher ceiling level. Whilst the inclusion of the additional items 
will give a fairer picture of the child’s vocabulary level, it should be noted that this 
procedure will tend to produce higher raw scores than if the manual instructions were 
rigidly adhered to. 
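
The ceiling criterion quoted above (five incorrect responses in eight consecutive items) can be 
illustrated with a sliding-window check. This Python sketch is an assumption-laden 
illustration: the function name is hypothetical, it implements only the rule as stated in the 
text, and it does not reproduce the manual's scoring or its treatment of edge cases. 

```python
def find_ceiling(responses, window=8, max_errors=5):
    """Return the index of the item that completes the first run of `window`
    consecutive items containing at least `max_errors` errors, or None.

    `responses` is a list of booleans, True for a correct response.
    """
    for end in range(window, len(responses) + 1):
        errors = sum(1 for correct in responses[end - window:end] if not correct)
        if errors >= max_errors:
            return end - 1
    return None


# Hypothetical response record: ten correct items, then increasing failure.
record = [True] * 10 + [False, True, False, False, True, False, False, True]
print(find_ceiling(record))  # 16
```

Applying the same function to the responses that follow a first ceiling would locate any later 
ceiling of the kind described in the text. 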


Results. The raw scores were converted into IQ equivalents using the test manual in order to 
provide an age-scaled score for each subject. For the reason given above these IQ 
equivalents cannot be considered reliable estimates of general IQ levels, and indeed the 
EPVT raw scores for the control group give a mean IQ equivalent in the superior range as 
compared with the bright average level given by their WISC results. 

The means and standard deviations of the age-scaled scores for the dyslexic and the 
control groups are given in Table 7. The results clearly show that the dyslexic group is 
worse than the control group on this measure of word knowledge which requires no verbal 
definitions to be formulated by the subject (see Table 7). 


Table 7. Results of memory test 6, age-scaled equivalents of vocabulary scores 

                      Dyslexic group     Control group 
                      (n = 32)           (n = 26) 
                      Mean      SD       Mean      SD       t       P 
EPVT (IQ equiv.)      104.7     21.5     123.8     16.5     3.8     0.005 


Test 7 — New word learning 


An individual's vocabulary level is a very stable knowledge system which develops 
throughout childhood by incorporating new information into the existing semantic 
systems. In view of the relatively limited vocabulary of dyslexic children, it is relevant to 
consider whether the processes mediating the acquisition of word meanings might be 
impaired in dyslexic children. 


Procedure. Each subject was required to learn the meanings of eight new words which were 
just above the ceiling level of his current vocabulary. Thus the test items for each subject 
were individually selected, and comprised the five errors that determined the ceiling level 
on the EPVT plus the next three errors made as the test was continued past that ceiling 
level. The eight EPVT pictures which illustrated the word meanings were cut out from the 
test booklet. (N.B. Two subjects completed the EPVT with fewer than eight errors, and in 
these cases the errors made were supplemented by nonsense words which were given 
meanings appropriate to the supplementary pictures.) The definitions of the eight words 
were read aloud to the subject, following which four of the pictures were displayed in a 
2 × 2 array. The experimenter spoke the four stimulus words in a random order and after 
each word asked the subject to point to the appropriate picture. The subject was not told 
whether or not he was correct. The remaining four pictures were then displayed and the 
testing procedure repeated. If any one of the picture-word matches was incorrect the whole 
procedure was repeated for a second trial using similar but not identical word definitions 
and a different order of presentation for these definitions. If any error was made on the 
second trial a third and final trial was administered. A measure of new word learning 
ability was given by the total number of picture-word match errors. 


Results. The means and standard deviations of the errors made by the two groups are 
given in Table 8. In view of the skewed distributions present in both groups a 
non-parametric test for group difference was applied as well as the t test. The dyslexic 
group as a whole clearly has more difficulty than the control group in new word learning, 
though it should be noted that the distribution of error scores does raise the possibility that 
this may not be an inherent feature of the dyslexic population. 


Table 8. Results of memory test 7, mean numbers of errors to criterion 

          Dyslexic group     Control group 
          (n = 32)           (n = 26)                              Mann-Whitney 
          Mean      SD       Mean      SD       t      P           U       Z        P 
NWLT      5.81      4.96     2.19      2.36     98     0.001       592     2.79     0.001 


Test 8 — Rate of access to picture names and category information 


An important aspect of any memory system is the speed at which information can be 
accessed from within it. Thus in the case of the semantic memory system it would seem 
appropriate to investigate not only vocabulary size and rate of acquisition but also the 
ability to operate or access this system. Therefore a series of tests was devised to investigate 
the speed of naming objects and of making semantic decisions about superordinate and 
subordinate category information. 


Procedure. In order to obtain a measure of each individual subject’s speed of vocalizing a 
simple yes/no response, the main part of Test 8 was preceded by a simple reaction/response 
time test. Following four practice items, eight squares (four red, four black) were randomly 
presented by a tachistoscope each for a 2 s exposure, and the subject gave a yes/no 
response to the question ‘Is the square red?’ The fastest and slowest response times were 
omitted and a mean simple response/reaction time was calculated for each subject from the 
remaining six items. 

The test stimuli for the main part of Test 8 were common objects, pictures of which were 
obtained from an EPVT booklet. Two parallel sets of 16 objects (each including four 
practice items) were selected according to a balanced design in which half the objects were 
living and half were larger than the children acting as subjects. Each subject received both 
sets of stimuli, one set being presented visually and the other set auditorially, the mode and 
order of set presentation being determined by a 2 x 2 counterbalanced design. 

In the visual condition the set of objects was presented tachistoscopically, each picture 
being exposed for a 2 s period. The set was presented three times. On the first ‘run’ the 
subject was required to name each object as rapidly as possible. Half the subjects were then 
required to make a yes/no response to the question ‘Is it living?’ to each picture on the 
second ‘run’, and to the question ‘Is it bigger than you?’ to each picture on the third 
‘run’. The other half of the subjects answered the questions in reverse order. 

In the auditory condition the set of objects was read aloud by the examiner. The set was 
presented three times. On the first ‘run’ the subject merely listened to the words. For the 
next two 'runs' the same questions were asked in the same order as in the visual 
presentation condition. 

The push button operating the tachistoscope also started the timer, otherwise timing was 
manual and recorded to the nearest 0-1 s. For each run of items the fastest and slowest 
response times were omitted and a mean response time calculated from the remaining 10 
items. For each subject, the mean simple response/reaction time was subtracted from each 
of the mean response times to give mean decision times on the naming task and each of the 
four semantic decision tasks. 
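
The calculation of a decision time can be sketched as follows. This Python example is an 
illustration only, using hypothetical response times and hypothetical function names; it 
follows the steps given above (discard the fastest and slowest time, average the rest, then 
subtract the subject's mean simple response/reaction time). 

```python
def trimmed_mean(times):
    """Mean after discarding the single fastest and single slowest time."""
    if len(times) < 3:
        raise ValueError("need at least three response times")
    kept = sorted(times)[1:-1]
    return sum(kept) / len(kept)


def decision_time(task_times, simple_times):
    """Mean task response time minus mean simple response/reaction time."""
    return trimmed_mean(task_times) - trimmed_mean(simple_times)


# Hypothetical data in seconds: 8 red/black items and 12 items from one run.
simple = [0.7, 0.7, 0.4, 0.7, 0.7, 1.5, 0.7, 0.7]
naming = [1.2, 1.2, 0.5, 1.2, 1.2, 1.2, 3.0, 1.2, 1.2, 1.2, 1.2, 1.2]
print(round(decision_time(naming, simple), 2))
```

Trimming one value from each end leaves six of the eight simple items and ten of the twelve 
test items, matching the counts given in the text. 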


Results. The means and standard deviations for the simple reaction/response task, the 
naming task and the total of the four semantic decision tasks for both groups are given in 
Table 9. (Only the total of the four semantic decision times is considered here, but there 
were no significant differences between the two groups in any of the four individual 
decision tasks; see Nelson, 1978.) The results of the t tests indicated no significant 
differences between the two groups on the naming and decision tasks, which implies that 
the dyslexic children could name very common objects (e.g. dog and ball) as rapidly as the 
non-dyslexic children, and that the times to access superordinate and subordinate category 
information were not impaired in the dyslexic children. 


Table 9. Results of memory test 8, mean response times for simple reaction time task and 
mean decision times for naming and semantic categorization tasks 

                                  Dyslexic group     Control group 
                                  (n = 32)           (n = 24) 
                                  Mean      SD       Mean      SD       t        P 
Red/black response (s)            0.78      0.19     0.73      0.15     1.06     n.s. 
Picture naming decision           0.34      0.25     0.36      0.16     0.39     n.s. 
  time (s) 
Total of four semantic            1.47      0.75     1.25      0.53     1.20     n.s. 
  decision times (s) 

Discussion 

The three main findings to emerge from this investigation are that (1) dyslexic children are 
impaired both in a capacity and a temporal measure of short-term storage, (2) dyslexic 
children have a modality-specific deficit of long-term memory storage, and (3) although 
access to the semantic memory system appears to be normal, acquisition of new 
information to the semantic system is impaired. First we shall consider the interpretation of 
the memory test results in terms of multiple-memory systems deficits and their anatomical 
implications. Secondly we shall consider how the memory deficits demonstrated in dyslexic 
children could affect the acquisition of reading skills. 

The dyslexic children in the present study were impaired on the digit span measure, a 
result that is highly consistent with all previous studies. Furthermore on the 
Brown-Peterson short-term memory forgetting task there was a very clear-cut deficit in the 
dyslexic group on both the auditory and the visual version of the test. Neuropsychological 
evidence points to the existence of separate and independent visual-verbal and 
auditory-verbal short-term memory systems (Warrington & Shallice, 1969; Warrington & 
Rabin, 1971; Warrington & Shallice, 1972) and evidence deriving from studies of deaf 
children (Conrad, 1972; O'Connor & Hermalin, 1973) and normal adults (Kroll et al., 
1970; Levy, 1971) can also be interpreted in terms of modality specificity of short-term 
memory storage. It is now widely accepted that articulatory suppression effectively prevents 
the normal strategy of recoding visually presented letters in an acoustic form, so that in the 
former condition it is assumed that recall must be from a visual-verbal system. In the 
present study it is considered that the children's rapid repetition of their own names would 
effectively prevent acoustic recoding and therefore that the findings using the 
Brown-Peterson paradigm implicate not only the auditory-verbal but also the 
visual-verbal short-term memory system. It is concluded from this investigation of 
short-term memory functions in dyslexic children that their short-term storage systems are 
impaired with respect to two important parameters, namely those of capacity and time 
constants. 

The dyslexic group was impaired on the forced-choice recognition memory test for 
auditorially presented words, whereas on a similar test of recognition memory for 
photographs of faces their performance was at a normal level. Each recognition judgement 
was made after 40 intervening items so that the retention tests were well outside the limits 
of short-term storage. This finding of a verbal/visual discrepancy in long-term storage is 
comparable to the results of Guyer & Friedman (1975) using complex pictorial material. 
Furthermore, although there was no evidence for enhanced interference effects in the 
dyslexic group, their performance on the cross-modal visual-verbal paired-associate tests 
was generally weaker than that of the normal control group. 

In contrast to their impairment on the verbal forced-choice recognition test and on the 
PA learning tests the dyslexic group was not significantly worse than the normal control 
group in recalling the word lists of the PI release test, a test which also clearly involves 
long-term storage. In so far as there was neither a ceiling nor a floor effect on this latter 
test it seems unlikely that the failure to differentiate between the dyslexic and the control 
group can be attributed to this test providing an insensitive measure of long-term storage. 
One important and possibly relevant difference between these tests of verbal long-term 
memory is that the semantic organization of the stimulus material was made quite explicit 
before each PI list was given, whereas no semantic orientating task was demanded for the 
verbal forced-choice recognition list. Thus although this study provides no support for the 
view that dyslexic children are less able than normal to take advantage of semantic 
organization to assist recall, a difficulty in initiating verbal mediation would be consistent 
with the present findings. 

Knowledge of word meanings was investigated in an attempt to assess the efficiency of 
certain aspects of semantic memory in dyslexic children. The word/picture matching test 
(EPVT), on which the dyslexic group was impaired, provides a measure of the limits of 
development in the acquisition of word meaning. The development of a child's semantic 
system requires not only that he has the opportunity to learn new word meanings but also 
that he has within his repertoire the abstract concepts necessary to understand the full 
meaning of the new words. In the present new-word learning test the stimulus items were 
words at the upper limit of each subject's word/picture vocabulary so that it is not 
unreasonable to assume that the child possessed the concepts necessary for their 
comprehension and therefore that integration of these new words into his existing semantic 
structure could occur. In this test the procedure devised by Walton & Black of giving the 
meanings of words in different terminology from trial to trial was adopted: this procedure 
enhances the role of semantic organization rather than long-term rote learning. Therefore we 
would argue that the results of this test, when taken together with their overall poorer 
vocabulary levels, indicate that the dyslexic children are impaired in their ability to 
incorporate new material within their semantic knowledge systems. In contrast, the failure 
to find any significant difference between the groups in their speed of naming and in 
category decision times suggests that the dyslexic children may have normal access to 
semantic information. However, the fact that all the objects in this latter test were high 
frequency nouns does not permit this conclusion to be generalized to include object names 
less well established in semantic memory. 

What, if any, is the relationship between the short-term storage and long-term storage 
deficits that have been demonstrated in the dyslexic group? In our view the distinction 
between short-term and long-term memory systems is amply justified both by experimental 
studies in normal subjects and from neuropsychological data, and on these grounds it 
seems probable that the deficits on the present short- and long-term memory tasks in the 
dyslexic children are independent phenomena. 

The interrelationship between long-term memory and semantic memory is far from clear, 
but it seems likely to be a complex one. For example, the efficiency of verbal mediation 
which can be brought to a long-term memory task may be limited by the subtlety of the 
child's existing semantic systems, whilst on the other hand long-term retention of 
associations or context may be important for the normal development of semantic memory 
systems. Thus the possibility must be considered that the long-term and semantic memory 
deficits in the dyslexic group are related rather than independent phenomena. If efficient 
long-term memory functions were essential for the maintenance of an item during 
acquisition in a more permanent form in semantic memory, then this could account for the 
different pattern of results in the tests maximizing acquisition to semantic memory and 
those maximizing access to semantic memory. 

Neuropsychological evidence not only supports the notion of multiple memory systems 
but suggests there is a separate anatomical basis for each system. In the adult, lesions in 
specific post-rolandic regions of the left hemisphere result in verbal memory deficits. The 
impairment of short-term storage for auditory-verbal material is associated with left 
inferior parietal lobe lesions (Warrington et al., 1971) whilst the critical area for 
visual-verbal short-term storage may be more posteriorly placed (Kinsbourne & 
Warrington, 1963). Left temporal lobe lesions can give rise both to verbal long-term 
memory deficits (e.g. Milner, 1971) and with more posteriorly placed lesions to verbal 
semantic memory deficits (Coughlan & Warrington, 1978). Thus all the deficits observed in 
the dyslexic group occur in functions mediated by the post-rolandic regions of the 
dominant hemisphere. However, it is now generally agreed that, by definition, dyslexic 
children are neurologically normal and have not sustained any brain damage (at least at a 
macroscopic level). Therefore we would suggest that the memory deficits in the dyslexic 
child are best understood in terms of functional inefficiencies of the relevant cortical 
substrates in the post-rolandic regions of the dominant hemisphere, and that they can be 
encompassed by the notion of both inherited and spontaneously occurring congenital 
disabilities. 

It is appropriate to consider the memory deficits demonstrated in the present group of 
dyslexic children in the context of the ‘two-route’ theory of normal reading. It is now 
widely accepted that phonological mediation is not necessary for skilled reading of 
individual words in the adult. Indeed it appears that efficient reading generally involves the 
‘direct’ route by which the semantic representation of the word is accessed directly from the 
visual word form. This ‘sight’ vocabulary, which is very large in the adult reader, becomes 
essential for reading orthographically irregular words (e.g. colonel, bury) and for 
disambiguating homophones (e.g. sale, sail). But clearly the graphemic-phonological route 
which transforms the visual word form to a phonological code prior to attaining meaning 
could be used for the large majority of words, and it is clearly essential for the reading and 
subsequent understanding of non-word homophones (e.g. brane). Patients with acquired 
dyslexia provide a firm neuropsychological basis for the distinction between these two 
modes of reading. Independent deficits both of phonological recoding and of reading direct 
for meaning without phonological mediation have been described (Warrington, 1975; 
Beauvois & Derouesne, 1979; Shallice & Warrington, 1979). 

Even quite young children have a limited sight vocabulary. This ‘direct’ route reading 
has been shown to be operative for familiar words in children as young as 6-8 years old 
(Frith, 1979). But with only a small ‘sight’ vocabulary the young child must necessarily be 
very dependent on phonological mediation to read new words and so extend his sight 
vocabulary. Therefore it seems plausible to assume that for the acquisition of normal 
reading skills both the ‘direct’ route to meaning and the phonological route are essential. 

Leaving aside the problem of whether or not there are separate meaning systems for the 
spoken word and the written word, accessing a semantic memory system is at the core of 
‘direct’ route reading. In the present study it has been shown that although the dyslexic 
group was able to access well-established semantic information at a normal rate it was 
significantly worse than the control group in the acquisition of new information. It is 
suggested, therefore, that it is appropriate to consider the reading retardation in terms of a 
slower than normal rate of acquisition of sight vocabulary as lexical entries. Thus not only 
is the auditory-verbal vocabulary limited in dyslexic children but so also is their 
visual-verbal vocabulary. We would argue, then, that reading retardation in part may be 
accounted for by a more general impairment affecting the normal acquisition to semantic 
memory systems. 

Whether or not the phonological mediation of indirect route reading operates at the level 
of graphemic units or syllabic units, the phonemic elements must be held in some form of 
temporary storage in order to recode the whole word in phonemic form. In view of the 
duration and accuracy of the storage required, and more particularly in view of the 
phonological nature of the material to be stored, the auditory short-term storage system 
would appear to be the most appropriate for this function. Dyslexic children appear to be 
normal in their knowledge of letter sounds (Nelson, 1978), but the dyslexic group in the 
present study were impaired both in the immediate and delayed retrieval from short-term 
storage. Therefore, it is suggested that the dyslexic child may be unable to hold the 
phonemic transformations of a visual word form with sufficient accuracy or for a sufficient 
length of time to succeed in reblending it as a whole word. Clearly short-term memory 
deficiencies could limit the effectiveness of phonological reading. 

In conclusion, it is our view that the reading difficulties of dyslexic children could best be 
accounted for by a combination of memory deficits, deficient short-term storage having its 
principal effect on phonological reading and deficient semantic memory having its principal 
effect on ‘direct’ route reading. 

There is no reason to suppose that these two independent memory systems will be 
impaired to the same absolute or relative extent in all dyslexic children and indeed it is 
known from observations of adult patients with acquired dyslexia that each of these two 
modes of reading may be impaired independently of the other. It is therefore likely that 
developmental dyslexia is not a unitary disorder and that qualitatively different types of 
reading difficulty will occur in a population of retarded readers. 

There have been several attempts to subdivide the developmental dyslexic population. 
Some studies have reported evidence of heterogeneity in the dyslexic population with 
regard to the pattern of associated deficits (e.g. Naidoo, 1972; Nelson & Warrington, 1974) 
and some have reported heterogeneity on the basis of the quality of the reading and/or 
spelling difficulties experienced (e.g. Boder, 1973; Nelson & Warrington, 1974). On the 
basis of the hypothesis that both of two independent reading routes must function 
efficiently for the normal development of literacy skills we would argue that the pattern of 
difficulties found in Boder’s ‘dysphonetic’ dyslexia, or auditory dyslexia, can best be 
explained by maximal deficit in the phonological route of reading whilst those found in 
Boder’s ‘dyseidetic’ dyslexia, or visual dyslexia, can best be explained by maximal deficit in 
the ‘direct’ reading route. On the basis of the ‘dual route’ hypothesis one would expect the 
majority of severely affected children to have some combination of deficits in the two 
reading routes, and indeed mixed dysphonetic/dyseidetic patterns occurred most frequently 
in Boder's (1973) study. Thus the occurrence of different patterns of dyslexic difficulties is 
thoroughly consistent with the hypothesis that developmental dyslexia stems from deficits 
in independent memory systems, deficits which in turn limit the effectiveness of the two 
independent routes of reading. Not only can such a model explain the occurrence and 
distribution of qualitatively different types of developmental dyslexia in the general 
population but also it has the potential to describe and explain the difficulties experienced 
by the individual dyslexic child. 
The aesthetics of simple figures 
I. C. McManus 





After reviewing the literature on experimental rectangle aesthetics (‘the golden section’) it was 
concluded that all of the effects demonstrated in the literature were dubious, either due to 
methodological limitations or inadvertent experimenter bias, these defects being compounded by 
most studies only considering population preferences and ignoring individual differences in 
preference. A series of experiments is described in which highly significant and temporally stable, but 
somewhat idiosyncratic, individual preferences were found. A taxonomy of these preferences, as well 
as of those for three separate types of triangle preference, is provided, based on factorial analysis. 
Two clear factors were demonstrated, one based on the square, and the other upon a proportion 
similar to that of the golden section. 





SOCRATES The beauty of figures which I am now trying to indicate is not what most people 
would understand as such, not the beauty of a living creature or a picture; what I mean... is 
something straight or round, and the surfaces and solids which a lathe, or a carpenter's rule and 
square produces from the straight and round... Things like that, I maintain, are beautiful not, 
like most things, in a relative sense; they are always beautiful in their very nature, and they carry 
pleasures peculiar to themselves... Philebus, 51.c 


The problem of proportion has seriously concerned aesthetics for two and a half millennia. 
At its simplest the problem reduces to the question, What is the most harmonious manner 
in which to arrange a small number of lines or other modular elements? Two major 
theoretical approaches can be found (further details of which may be found in the reviews 
by Arnheim, 1955; Wittkower, 1960; Panofsky, 1970; Zusne, 1970; Berlyne, 1971). 

The older approach emphasizes the integers and their relations. It is shown par excellence 
in the Pythagorean analysis of musical intervals, where the most harmonious pairs were 
found to be in the proportions 1:2, 2:3, etc. This system was codified by Vitruvius in 
his De Architectura of circa 30 B.C. The early Renaissance used the same aesthetic, it being 
clearly shown in Leone Battista Alberti's Ten Books on Architecture, published in 1485. 
This point is the historical watershed between two aesthetics; 24 years later Luca Pacioli in 
his De Divina Proportione (1509) proposed that the fundamental proportion in aesthetics 
was the golden section. The fascinating mathematical properties of this geometric figure have 
been well described by Schooling (1914), Archibald (1920) and Huntley (1970). Pacioli's 
theory spread rapidly, influencing Dürer (Brion, 1960), Ramus, and Kepler (Sarton, 1951). 

Classical and Renaissance aesthetic theorizing had been essentially a priori and 
prescriptive in its approach to aesthetics. Plato's Timaeus had suggested that within the 
properties of numbers themselves could be found the essence of the universe, and aesthetics 
was seen as a branch of this numerical cosmogony. This attitude was modified after the 
Renaissance, so that the origin of the numbers themselves was partly empirical. Thus 
Alberti claimed to have measured the actual proportions of human bodies, and Dürer also 
carried out much research on human proportion. Even Burke (1757) accepted that such 
natural proportion might compel him to accept a system of numerical aesthetics. As a part 
of the 19th century German revival in aesthetics, Zeising (1854, 1855), and also Henzlmann 
(1860) suggested that because the golden section and other ratios could be found 
empirically within nature then the ratio must (and the argument is still prescriptive) be of 
aesthetic significance. 
Gustav Fechner, in his Vorschule der Aesthetik (1876), is usually represented as 
attempting to remove the prescriptive part of the argument and to actually measure 
aesthetic preferences themselves. There are however reasons for believing that Fechner also 
had a strong prescriptive wish which modified his experiments. 

As a part of his new aesthetics ‘from below’ instead of in the traditional manner, ‘from 
above’, Fechner proposed three experimental methods. The most widely used is the 
‘method of choice’, in which subjects are allowed to choose between several alternatives, 
selecting that which they feel is the most beautiful, or the most pleasant. In his most 
famous experiment Fechner seated subjects in front of a series of 10 rectangles of various 
width-length ratios and simply asked them to choose the one which they liked best and the 
one which they liked the least (Fechner, 1876). The modal preference was indubitably for 
the golden section, although the spread around the section was wide and there was a hint, 
particularly in Lalo’s (1908) replication of the experiment, of a secondary peak at the 
square. The experiment has been enormously influential, being accepted by many 
non-scientists, and indeed by many psychologists, as incontrovertible scientific proof of the 
superiority of the golden section. It is therefore worth looking further at Fechner, and his 
motivations for studying this particular subject, as well as at some of his related research. 

From an early age Fechner had waged a long war against the growing materialism of the 
19th century, and this is partly manifested in his works on life after death, and on the 
mental life of plants (Fechner, 1835, 1848). His psychophysical researches were inspired 
when ‘lying in bed on the morning of the 22nd October, 1850, he saw the vision of a 
unified world of thought, spirit and matter, linked together by the mystery of numbers’ 
(Brett, 1921). His fascination with numerical aesthetics had been revealed earlier when he 
had published his thoughts on the form of angels and had concluded that they must be 
spherical, for the sphere was the most perfect of forms (Fechner, 1825; Boring, 1940). To 
such a man we may speculate that the mathematical properties of the golden section would 
represent a useful link between the harmony of nature and the world of the spirit. Whilst it 
is not possible to accuse Fechner of direct, nefarious alteration of his experimental results 
so that his data fitted with his prior theories, we may speculate as to how much Fechner, 
consciously or subconsciously, produced experimental circumstances which would tend to 
give him his desired results. Godkewitsch (1974) and Piehl (1976) have shown that the 
method of rank ordering is very sensitive indeed to artifacts, both of experimenter 
expectancy and subject expectation, as well as to the range of stimuli presented, both 
midpoint and anchoring tendencies being found (although Benjafield, 1976, suggests that 
some of Godkewitsch's own results may themselves be artifactual). Fechner's subjects were 
not selected at random, and it is quite feasible, particularly given his rejection elsewhere of 
double-blind methods (Fechner, 1860; David, 1968, pp. 16-17), that his ‘cultured’ subjects 
were well aware of the intentions of the experimenter. As Godkewitsch put it: ‘In 
Fechner's study the subjects, asked to choose the most pleasing rectangle, often waited and 
wavered, rejecting one rectangle after another. Meanwhile the experimenter would explain 
that they should carefully pick a rectangle whose ratio between its sides could on the 
average be considered as most satisfying, harmonic and elegant’. 

In summary we cannot accept Fechner's experiment as adequate proof of a general 
population preference for the golden section. It is also noteworthy that Fechner carried out 
experiments on the aesthetics of ellipses and, having failed to find a preference for the 
golden section, did not publish his results, these being found posthumously in his 
unpublished papers (Witmer, 1894). 

As far as Fechner's other methods are concerned, he did not apparently carry out any 
experiments using the ‘method of production’, that is, asking subjects to draw or construct 
rectangles of the most pleasing proportions. 
Fechner’s ‘method of application’ was to consider works of art or other artifacts, and to 
examine the proportions used in their construction. He himself found that the mean 
height-width ratio of paintings was removed from the golden section, although he did not 
give actual distributions, only means (Fechner, 1876). 

Fechner's principal experiment has been replicated several times. Studies by Lalo (1908), 
Thorndike (1917), Thompson (1946), Shipley et al. (1947), Nienstedt & Ross (1951), 
Eysenck & Tunstall (1968) and Berlyne (1971) have all used variants of the ‘method of 
choice’, simultaneously presenting a series of rectangles to a subject and asking him to 
rank them in order of preference, and have obtained broadly similar results to those of 
Fechner. However the method still has methodological defects (Godkewitsch, 1974; Piehl, 
1976) and it must therefore be concluded that despite the replications the method is itself 
inadequate for the analysis of the problem of rectangle aesthetics. 

Haines & Davies (1904) asked subjects to look at a single stimulus at a time and to 
‘accept or reject’ it. They found large variations both within and between their small 
number of subjects. Hintz & Nelson (1970) used a method of ‘successive approximation’: 
this very dubious technique, which they also used in their 1971 study, would appear to be 
open to severe methodological criticism, not the least of which is that it makes no 
provision for subjects to have more than one preference. Amongst the ‘methods of choice’, 
only that of Weber (1930) seems to be devoid of serious methodological criticisms, since he 
used the method of paired comparisons, whereby a judgement is made separately for each 
pair of stimuli. Since there is good evidence that ranking tends to be a process of successive 
(but limited) paired comparisons (Russo & Rosen, 1975), this would seem to represent a 
solution to the problem of method, although even here we cannot be sure that range or 
anchor effects are not of significance. 

Piehl (1978) also used the method of paired comparison. However he used only seven 
stimuli (and hence only 21 pairs) and his results are difficult to interpret due to the massive 
size of the standard deviation relative to the differences between means. Piehl also fails to 
state whether his stimuli were ‘horizontal’ or ‘vertical’. 

Whilst all of the studies of rectangle preferences are open to objection it is perhaps 
worth pointing out that in some cases correlations with other factors have been found, 
which may not be entirely attributable to methodological artifact. Both Weber (1930) and 
Eysenck & Tunstall (1968) found a tendency on repeat testing for longer, thinner rectangles 
to be preferred. In addition Eysenck & Tunstall found a slight tendency for introverts to 
prefer longer, thinner figures. Young children (mean age 3-7 years) have no consistent 
group preferences (Thompson, 1946), and as they grow older their preferences grow 
increasingly like those of adults. In the old (61-91 years) there is a tendency to prefer 
squarer figures (Nienstedt & Ross, 1951). Preferences tend to be shown more clearly if 
figures of constant area rather than constant side length are used as stimuli (Shipley et al., 
1947). There is a small correlation between the shape of the visual field and the preferred 
rectangle shape (Hintz & Nelson, 1970). Certain groups tend to show preferences for 
squares: Berlyne (1970) found this tendency amongst most Japanese subjects, but only 
some Canadian subjects, and Hintz & Nelson (1971) suggest that haptic preference in 
congenitally blind subjects is for squares (as opposed to a haptic preference for golden 
sections in sighted subjects). 

Apart from rectangles there have been very few studies of other figures. Fechner himself 
looked at ellipses, and his results have been discussed earlier. Thorndike (1917) looked at 
triangles as well as crosses, and Lalo (1908) looked at crosses, and also dotted i figures (as 
had Fechner, see Witmer, 1894). Both sets of results are probably invalidated by the 
method of rank ordering. A notable exception to these methodological criticisms is the 
work of Austin & Sleight (1951) who looked at preferences for a series of isosceles triangles 
by the method of paired comparisons and found a consistent population preference, 
although they also noted large individual differences and suggested that their preference 
curve primarily represented a curve of ‘least dislike’ (see Fig. 8). 

The only other work of relevance is that studying regular polygons (e.g. Eysenck, 1968; 
Eysenck & Castle, 1970). They used as stimuli the figures of Birkhoff (1933) and asked 
subjects to rate each of them on a seven-point scale of ‘Aesthetic pleasingness’. They factor 
analysed their results and found one factor of particular relevance to the present study, 
since it contained the square, a rectangle of ratio approximately 2:1, an equilateral 
triangle, an isosceles triangle, and a right-angled equilateral triangle. 

From this survey of the experimental literature we are forced to conclude that there is 
really very little adequate evidence for any meaningful consistent population preference for 
simple figures, the only possible exceptions being the work of Weber (1931) and Austin & 
Sleight (1951). This is not however to say that there are no such preferences, although 
Godkewitsch (1974) and Piehl (1976) have suggested this (and Piehl, 1978, has since 
reversed his earlier decision). One major objection to all of the earlier work is that no study 
has been made of individual as opposed to population preferences. Population preferences 
often conceal large underlying individual differences. Those few studies where the authors 
have stressed the importance of looking at individuals (e.g. Haines & Davies, 1904; 
Thorndike, 1917) have been constrained by the lack of any adequate statistical test which 
will allow them to make meaningful statements about the preference of a single subject. No 
such test is possible with the limited data of a rank-ordering technique, but it is possible 
with the method of paired comparisons. A further problem is that no one has looked at 
preferences for rectangles, and also for other simple figures, such as triangles, in the same 
subjects. Similarly, with the limited exception of Weber (1931), there has been no attempt 
to make a reasonably long-term follow-up of individuals to find out how stable their 
preferences are. The present study attempts to remedy some of these defects. 


A note on the description of stimuli 


A rectangle may be described in terms of the ratio of the horizontal to the vertical side (if 
the area is constant). The difficulty with using the simple ratio, horizontal length (width, H) 
divided by vertical length (height, V), is that rotation through 90 degrees produces a figure 
whose shape or form is the same, but whose ratio is now the reciprocal of H/V. A more 
serious difficulty is that figures of equal intervals on this ratio scale are not perceptually 
equidistant (thus the difference between rectangles of ratio 1.1 and 1.2 is not 
psychologically the same as that between rectangles of ratio 0.2 and 0.3). In this report 
therefore the logarithm to the base 10 of the ratio (H/V) will be used throughout. This has 
the advantage that if a figure is simply rotated through 90 degrees then one merely has to 
alter the sign of the log. ratio (e.g. on rotation a rectangle of log. ratio 0.30 becomes one of 
log. ratio -0.30): the intervals are also approximately perceptually equidistant. The 
description of the ratio of isosceles triangles (of two types A and B, see Fig. 1), and 
right-angled triangles (triangle type C) will be in terms of the enclosing rectangle. Thus a 
golden section triangle is defined for these purposes as that drawn within a golden section 
rectangle (clearly this decision is in some sense arbitrary). The effect of rotating a triangle of 
type A through 90 degrees is to make it of series B, and also to alter the sign. 

The log. ratio of the golden section (φ = 1.6180...) is 0.2089, and ratios of 2, 3 and 4 have 
log. ratios of 0.3010, 0.4771 and 0.6020 respectively. 
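The log. ratio convention described above can be illustrated with a short sketch. The function names `log_ratio` and `rotate_90` are hypothetical and chosen only for this illustration; they do not come from the original report.

```python
import math

# Golden section, (1 + sqrt(5)) / 2, approximately 1.6180
PHI = (1 + math.sqrt(5)) / 2

def log_ratio(h, v):
    """Base-10 logarithm of horizontal length H over vertical length V."""
    return math.log10(h / v)

def rotate_90(log_r):
    """Rotation through 90 degrees swaps H and V, so only the sign changes."""
    return -log_r

# Print the log. ratios for the golden section and for ratios of 2, 3 and 4.
for ratio in (PHI, 2, 3, 4):
    print(f"ratio {ratio:.4f} -> log. ratio {log_ratio(ratio, 1):.4f}")
```

The printed values should agree with those quoted in the text to within one unit in the fourth decimal place, and a horizontal rectangle and its vertical counterpart differ only in sign.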


Method 


Altogether, three separate experiments have been carried out (Expts 1, 2 and 3). All used similar 
methods but differed in the details of presentation and of the particular stimulus types and values. In 
each case a complete paired-comparison design was used. Subjects were shown, for a particular 
stimulus type, all possible pairs of stimuli and were asked to make a single response for each pair, 
stating their relative preference on a six-point rating scale (strong, medium, weak preference for 
stimulus A, weak, medium, strong preference for stimulus B). Minimal instructions were given to the 
subjects, they simply being asked to record which stimulus they ‘preferred, or thought looked best’. 
No subject found this a strange task and they all settled down readily to the experiment proper after 
a brief practice run with about 10 pairs of stimuli. Subjects were asked to try as far as was possible 
to use all of the response categories. Most of the subjects were undergraduates at the University of 
Cambridge, none of whom was specializing in Fine Arts or Architecture, and very few of whom had 
ever heard of the golden section, or of its importance to aesthetics. Subjects were self-paced during 
the experiments, and they were encouraged to use immediate rather than considered impressions as 
judgements. Most subjects looked at each pair of stimuli for from 5 to 15 seconds before making a 
decision. In Expt 1 subjects were tested individually and in Expts 2 and 3 in pairs. Experimental 
sessions usually lasted from 2 of an hour to 1j hours. 

In Expt 1, 23 subjects were shown a series of ‘horizontal’ rectangles (i.e. rectangles of log. 
ratio > 0). Fifteen of these subjects also saw, usually in the same experimental session, but in a few 
cases on a separate occasion, a series of ‘vertical’ rectangles (i.e. of log. ratio < 0). A few subjects 
also saw, in separate experimental sessions, a few other series of stimuli (see Fig. 4). In Expt 2, 27 
subjects were shown a series of ‘mixed’ rectangles (i.e. vertical and horizontal rectangles in the same 
series), then a series of upright isosceles triangles (triangles A, see Fig. 1), and then a series of 
isosceles triangles turned on their sides (triangles B, see Fig. 1). All subjects saw all stimulus types. 
Triangles A represented a replication of the experiment of Austin & Sleight (1951), except that a 
rating scale was used, and the stimuli were of slightly different proportions. 


[Figure 1 not reproduced: example rectangles and triangles of types A, B and C, shown against ratio (H/V) and log. ratio, e.g. ratio 2.0, log. ratio 0.30.] 

Figure 1. Definitions of ratios used for describing rectangles and triangles: see Results section for 
further description. 





In Expt 3, 40 subjects saw a mixed series of rectangles, followed by a series of triangles A, and then 
triangles B, and finally triangles C (right-angled triangles — see Fig. 1). All subjects saw all four series 
of stimuli and, as in Expt 2, saw all stimulus types in the same order. 

The particular stimulus values in each experiment and series, and the number of stimulus pairs 
shown, are given in Table 1. 

The stimuli were shown by means of a pair of slide projectors and the members of each pair were 
shown side by side, the left-right positioning, as well as the overall order, being determined 
randomly, although the particular random order was the same for all subjects for any particular 
stimulus series. In Expt 1 the random order for horizontal rectangles was the same as that for vertical 
rectangles, the slides simply being rotated through 90 degrees in each projector. The stimuli consisted 
of solid white figures, all of equal area, projected against a dark background. 

Table 1. The stimulus values used for each stimulus type in each experiment (see Fig. 1 for 
definitions of stimulus types). Values are expressed as the log. ratio x 100 

Experiment  Stimulus type          No. of pairs  Stimulus values
1           Horizontal rectangles  105           0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 37.5, 45, 52.5, 60
1           Vertical rectangles    105           0, -3, -6, -9, -12, -15, -18, -21, -24, -27, -30, -37.5, -45, -52.5, -60
2           Rectangles             105           -52.5, -37.5, -27, -21, -15, -9, -3, 0, 6, 12, 18, 24, 30, 45, 60
3           Rectangles             105           -60, -45, -30, -24, -18, -12, -6, 0, 6, 12, 18, 24, 30, 45, 60
2           Triangles A            66            -47, -44, -40, -35, -30, -24, -18, -10, 0, 12.5, 30, 60
3           Triangles A            45            -60, -45, -30, -22.5, -15, -1.5, 0, 15, 30, 60
2           Triangles B            66            -60, -30, -12.5, 0, 10, 18, 24, 30, 35, 40, 44, 47
3           Triangles B            45            -60, -30, -15, 0, 7.5, 15, 22.5, 30, 45, 60
3           Triangles C            55            -45, -30, -22.5, -15, -7.5, 0, 7.5, 15, 22.5, 30, 45


A small number of subjects took part in the experiment on several occasions and their results have 
been included in each separate experiment in which they took part. Five of the subjects of Expt 1 
took part in Expt 2, and four of these same subjects also took part in Expt 3; one subject in Expt 2, 
who had not taken part in Expt 1, also took part in Expt 3. 


Analysis of results 
Rectangles 


Data were analysed by giving 5 points for a strong preference for a stimulus, 4 for a 
medium and 3 for a weak preference, and 2, 1, and 0 for a weak, medium or strong dislike 
respectively. Each single pair-comparison judgement therefore gave two numbers, a relative 
like for one stimulus, and a relative dislike for the other stimulus. These values were 
entered into an n x n matrix (the portion below the leading diagonal being the complement 
of the portion above). The leading diagonal was filled with zeros. Relative preferences for 
each stimulus were computed by taking the edge totals and then standardizing them by 
compressing them so that the maximum possible score was +1.0, and the minimum 
possible score was -1.0. This process was carried out both for individual subjects and also 
for groups of subjects, individual preference matrices having their respective cells 
summated. 
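
As an illustration only, the scoring and standardization just described can be sketched in Python. The function name, the (i, j, rating) input format and the linear rescaling are assumptions of this sketch; the text states only that the edge totals were compressed to lie between +1.0 and -1.0.

```python
def preference_scores(judgements, n_stimuli, n_subjects=1):
    """Standardized preference score per stimulus, in the range -1.0 to +1.0.

    judgements: iterable of (i, j, rating), where rating (0..5) is the score
    awarded to stimulus i when paired with stimulus j; stimulus j then
    receives the complement, 5 - rating.
    """
    matrix = [[0] * n_stimuli for _ in range(n_stimuli)]  # leading diagonal stays zero
    for i, j, rating in judgements:
        matrix[i][j] += rating
        matrix[j][i] += 5 - rating  # complementary cell across the diagonal
    # Largest possible edge total: a strong preference (5) in every pairing.
    max_total = 5 * (n_stimuli - 1) * n_subjects
    edge_totals = [sum(row) for row in matrix]
    # Assumed linear rescaling: total 0 maps to -1.0, max_total maps to +1.0.
    return [2 * total / max_total - 1 for total in edge_totals]


# One hypothetical subject, three stimuli, strong and fully consistent preferences.
scores = preference_scores([(0, 1, 5), (0, 2, 5), (1, 2, 5)], n_stimuli=3)
```

For group data the same function can be fed the judgements of all subjects together, with n_subjects set accordingly, which corresponds to summing the individual matrices cell by cell.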

Figure 2 shows the group preference for rectangles, the four different sets of stimulus 
values being plotted separately. Note that the ordinate is relatively expanded with respect 
to its possible range. Preference values may be compared within series, although not 
between series or between experiments. Several features are apparent in all of the curves. 
There is a dislike for the extreme ends of the spectrum, although this is stimulus-dependent 
and not range-dependent (compare Expt 1 with Expts 2 and 3). There is a preference for 
values around the golden section (φ) although in several cases the exact maxima are fairly 
discrepant from φ itself. The curves are also symmetric about log. ratio 0. In all 
experiments there is some evidence for a preference at about log. ratios of 0, i.e. the square, 
and in Expt 1 in particular this preference seems to be slightly shifted towards the right, a 
feature which might indicate the existence of the horizontal-vertical illusion, which has been 
shown to occur in solid rectangles (McManus, 1978). There is also perhaps a hint that the 
preferences around the golden section, both +φ and -φ, are shifted slightly to the 
right as well, although this conclusion is far from certain. The final point to note is the 
relatively small size of the population preference functions, the range being +0.2 to -0.2, 
in a possible range of +1.0 to -1.0. 

Figure 2. Population preferences for rectangles, results being plotted separately for the two parts of 
Expt 1, and for Expts 2 and 3. The vertical lines through the figure represent the positions of the 
square and the golden section. Separate symbols distinguish Expt 1H, Expt 1V, Expt 2 and Expt 3. 

In view of the relatively small effects it is desirable to have some form of statistical 
analysis against a null hypothesis. The null hypothesis is, of course, that the individuals in 
the experiment are simply responding at chance levels. To carry out a statistical analysis 
the individual preference matrices were converted into binary matrices, 0, 1, and 2 being 
collapsed into a value of 0, and 3, 4, and 5 being collapsed to a value of 1. These data were 
then analysed, either individually or as a group, by the methods described by David (1968). 
Two methods may be used. In the simplest (David, 1968, p. 38) the edge totals of the 
binary matrix may be tested for homogeneity. This has two limitations: the edge totals 
might be homogeneous despite a significant micro-structure within the data matrix itself, 
and also the analysis takes no account of any possible trends along the edge, the particular 
order of the stimulus values not being taken into account. The second method (David, 
1968, p. 25) based on that of Kendall & Babington Smith (1940) is more sensitive, taking 
account of the micro-structure of the data matrix itself. Consider three stimuli, P, Q, and 
R. Let a subject prefer (p) P to Q, and Q to R (i.e. P p Q, and Q p R). If he has consistent 
preferences we might expect that also P p R, whilst if he is merely responding at chance 
levels there should be an equal likelihood of R p P. Triads of the form P p Q, Q p R, P p R 
may be described as logical, transitive or consistent triads, whilst those of the form P p Q, 
Q p R, R p P may be described as illogical, intransitive or circular triads. The number of 
circular triads within a data set is a sensitive index of the degree of consistency of the 
responses. The method does not however take note of possible trends in the data due to 
ordering of the stimulus values, and thus is still an inherently conservative test. The 
application of a test which takes into account the ordered nature of the stimuli, such as 
that of Jonckheere (1954), is also unsatisfactory since there are no a priori orders which are 
intuitively reasonable. Hence the data may be only tested for trend a posteriori and this is 
statistically unsatisfactory. 
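
The definition of a circular triad can be made concrete with a short sketch. It assumes a complete binary preference matrix (every pair judged once, no ties); the function names are illustrative. The second function uses a standard row-total identity for such matrices and should give the same count as the direct enumeration.

```python
from itertools import combinations


def count_circular_triads(pref):
    """pref[i][j] == 1 if stimulus i was preferred to stimulus j, else 0 (i != j)."""
    circular = 0
    for p, q, r in combinations(range(len(pref)), 3):
        # Circular: P p Q, Q p R, R p P, or the same cycle in the other direction.
        if pref[p][q] and pref[q][r] and pref[r][p]:
            circular += 1
        elif pref[q][p] and pref[r][q] and pref[p][r]:
            circular += 1
    return circular


def count_from_row_totals(pref):
    """Same count from the edge totals a_i: d = n(n-1)(2n-1)/12 - (sum of a_i^2)/2."""
    n = len(pref)
    sum_squares = sum(sum(row) ** 2 for row in pref)
    return (n * (n - 1) * (2 * n - 1) - 6 * sum_squares) // 12


cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]      # P p Q, Q p R, R p P
transitive = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # P p Q, Q p R, P p R
```

The row-total form shows why the test ignores stimulus order: only the edge totals enter, not which stimulus produced which total.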

Using the method of circular triads for the combined horizontal rectangle preferences of 
Expt 1, the population value of U is 0.073 (possible range = 1.0 to -0.043) (the maximum 
value of U is of course 1, when all subjects agree completely, whilst the minimum value 
cannot be -1.0, for complete disagreement between subjects is not logically possible, and 
the particular minimum must be calculated for the particular preference matrix). For the 
horizontal rectangles the value of U is significantly different from chance (χ² = 304.5, 
d.f. = 120, P < 0.001). For the vertical rectangles of Expt 1 there is no significant degree of 
inter-subject agreement by this test (U = -0.024, range of U = 1.0 to -0.067, χ² = 90.1, 
d.f. = 130, n.s.). The results for Expt 2 are just significantly different from chance 
(U = 0.009, possible range of U = 1.0 to -0.036, χ² = 145.4, d.f. = 117, P < 0.05), whilst 
those for Expt 3 are significant (U = 0.049, possible range of U = 1.0 to -0.025, 
χ² = 326.0, d.f. = 113, P < 0.001). In summary it would seem that there probably are 
population preferences for rectangles, but that these effects are small. 
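
The text does not give the formula for U. The sketch below assumes it is the coefficient of agreement of Kendall & Babington Smith (1940), computed from a matrix of counts; this is an inference, supported by the fact that the lower bounds it yields for 23 and 15 subjects agree with the ranges quoted for the two parts of Expt 1. Function names are illustrative.

```python
from math import comb


def coefficient_of_agreement(counts, n_subjects):
    """counts[i][j] = number of subjects preferring stimulus i to stimulus j (i != j)."""
    n = len(counts)
    # Number of pairs of subjects who agree, summed over every ordered cell.
    agreeing_pairs = sum(
        comb(counts[i][j], 2) for i in range(n) for j in range(n) if i != j
    )
    return 2 * agreeing_pairs / (comb(n_subjects, 2) * comb(n, 2)) - 1


def minimum_u(n_subjects):
    """Lower bound of the coefficient: -1/(m - 1) for even m, -1/m for odd m."""
    m = n_subjects
    return -1 / (m - 1) if m % 2 == 0 else -1 / m


# Three hypothetical subjects in complete agreement on three stimuli.
unanimous = [[0, 3, 3], [0, 0, 3], [0, 0, 0]]
u_unanimous = coefficient_of_agreement(unanimous, n_subjects=3)
```

Under this assumption the minimum depends only on the number of subjects, which is why a separate lower bound is quoted for each experiment.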

A small population effect may be due either to an overall weak preference, or might be 
due to a strong preference within each subject, with these preferences being sufficiently 
different to cancel one another out when summed. 

Figure 3 shows results for six individual subjects. All except subject 42 are significantly 
different from chance (P < 0.001) by the method of circular triads described above. It is 
clear from these individual preference functions that there is a wide range, and that the 
individual effects are of far greater magnitude than the population effect, preferences of 
close to +1 and -1 being reached in several subjects. Clearly the majority of the 
population preference function is a result of unjustified addition of qualitatively unlike 
individual functions. 

The majority of individual preference functions are significantly different from chance. 
Table 2 shows the number of circular triads found in the three experiments. Clearly it is 
not possible for all triads to be circular. Kendall & Babington Smith (1940) demonstrated 
that for a 15 x 15 paired comparison matrix one can obtain a maximum of 140 circular 
triads (out of a total of 455 triads) and that chance alone would produce a modal value of 
120 triads. A total of less than 96 triads is significant at the 5 per cent level, less than 81 at 
the 1 per cent level, and less than 72 at the 0.1 per cent level. From Table 2 it is clear that 
for all three experiments the majority of subjects are producing significantly non-random 
preference matrices. 
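
The totals quoted for a 15 x 15 matrix can be checked directly. The closed forms used here are the standard ones for complete paired comparisons; the function names are illustrative.

```python
from math import comb


def total_triads(n):
    """Number of distinct sets of three stimuli among n."""
    return comb(n, 3)


def max_circular_triads(n):
    """Largest possible number of circular triads: (n^3 - n)/24 for odd n,
    (n^3 - 4n)/24 for even n."""
    return (n**3 - n) // 24 if n % 2 else (n**3 - 4 * n) // 24


triads_15 = total_triads(15)
max_circular_15 = max_circular_triads(15)
```

For n = 15 these give 455 triads in all, of which at most 140 can be circular, in agreement with the figures cited above.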

Figure 3. Individual rectangle preference functions for six subjects, subjects 7, 42 and 50 being from 
Expt 2, and subjects 73, 88 and 92 from Expt 3. All the preference functions, except that of subject 
42, are significantly different from chance with P < 0.001. Subject 42’s rectangle preferences are 
indistinguishable from chance: however she also carried out triangle preference experiments and 
produced highly significant results, a finding which occurred in several other subjects. The examples 
have been chosen for their range rather than in proportion to their actual rate of occurrence. 

Table 2. The number of subjects producing various numbers of circular triads in the 
rectangle experiments 

n triads   0-71        72-80              81-96             97-140
Sig.       P < 0.001   0.001 < P < 0.01   0.01 < P < 0.05   n.s.     n
Expt 1H    16          1                  2                 4        23
Expt 1V    12          0                  2                 1        15
Expt 2     19          0                  3                 5        27
Expt 3     30          1                  6                 3        40
Total      77          2                  13                13       105

The analysis of circular triads, as described above, has taken no account of the strength 
of the subjects’ preferences, being based on the binary preference matrix. However we 
might reasonably expect that if subjects produce circular triads as a result of genuine 
errors, then when they do so they should make weaker judgements than when they are 
making non-circular triads. Let us give 1 point for a weak preference (in either direction), 2 
for a medium preference, and 3 for a strong preference. Of course any single preference 
judgement is a part of a large number of triads, both circular and non-circular: 
nevertheless we may still calculate the average response strength in each of the triad types, 
remembering that such a test will not be maximally sensitive. If we consider, for each 
subject individually, the ratio of the strength in non-circular triads to that in circular triads 
(x), then if the above hypothesis is correct the value of x should be greater than 1. 
Overall, of 104 subjects viewing rectangles, 82 had values of x > 1, and only 22 had values 
< 1 (χ² = 33.47, d.f. = 1, P < 0.001). For an individual subject the significance of the 
difference between triad types may be found by comparing the frequencies of the three 
response categories in a 3 x 2 contingency table. Table 3 shows that overall 48 subjects 
(46.1 per cent) had significantly high values of x as compared with only 4 (3.8 per cent) 
subjects with significantly low values of x. 
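
The reported value of 33.47 for the 82 versus 22 split can be reproduced if one assumes a one-sample chi-square test against an equal split with Yates' continuity correction. That choice of test is an inference from the arithmetic, not something stated in the text, and the function name is illustrative.

```python
def chi_square_equal_split(a, b, yates=True):
    """Chi-square (1 d.f.) for two observed counts against an expected 50:50 split."""
    expected = (a + b) / 2
    correction = 0.5 if yates else 0.0
    return sum((abs(obs - expected) - correction) ** 2 / expected for obs in (a, b))


chi2 = chi_square_equal_split(82, 22)
chi2_uncorrected = chi_square_equal_split(82, 22, yates=False)
```

Without the continuity correction the statistic is somewhat larger (about 34.6), so the corrected form is the one that matches the printed figure.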


Table 3. Strength of preference judgements used in circular and non-circular triads, in 
rectangle experiments. Let x = (non-circular score)/(circular score). One subject produced 
no circular triads at all and has been excluded from this analysis 

             x < 1      x < 1   x > 1   x > 1              x > 1
Experiment   P < 0.05   n.s.    n.s.    0.05 > P > 0.001   P < 0.001   n
1H           0          6       1       6                  10          23
1V           2          2       5       4                  1           14
2            1          5       12      3                  6           27
3            1          5       16      10                 8           40
Total        4          18      34      23                 25          104


It thus seems that the preference functions of individual subjects are statistically highly 
significant. These functions are however only of real interest if they can be shown to be 
stable with respect to time. Experiments 1 and 3 of this study were carried out at an 
interval of about 2 years. Four subjects took part in both experiments, and their preference 
functions are shown in Fig. 4. Note that these four subjects did not use, in this particular 
part of Expt 1, the stimulus values reported in Table 1, but a ‘mixed’ series, which 
provides greater compatibility with Expt 3. It is clear, given the range of individual 
preferences shown in Fig. 3, that these four preference functions show a reasonable 
temporal stability. 


Triangles 


The analysis of the triangle preference functions is essentially similar to that of the 
rectangles described in the previous section. Figure 5 shows the preference functions for 
each triangle type. All of the functions are significantly different from chance by the 
circular triad method (David, 1968, p. 25); see Table 4 for statistical analysis. 

Examination of the preference functions of Fig. 5 shows several features of note. For 
type A triangles the results compare very closely with those of Austin & Sleight (1951); 
preference functions thus seem to be stable across 23 years and two continents. For all of 
the triangle types the preference function seems to be unimodal (unlike the case of the 
rectangles), but like the rectangle preference functions, the magnitude of the population 
preferences is small in comparison with their possible range of +1 to -1. For triangles of 
type A and B the golden section seems to be of some importance, but interestingly only at 
-φ for type A and +φ for type B: the implication is that it is the form of these triangles 
which is important, rather than the shape of their enclosing rectangle. The function for 
triangles C is fairly symmetric, but is so flat-topped that it is difficult to know exactly 
where the maximum is located, or even whether the curve is unimodal. 

Analysis of the circular triads from individual subjects reveals that the majority of 
individual subjects have highly significant preference functions and that, as in rectangle 
preferences, there is a wide range of individual difference in preference functions: 
considerations of space preclude the description of individual variation for all triangles. 

Figure 4. Test-retest results for rectangle preferences for four subjects (ordinate: preference score; 
abscissa: 100 x log (ratio), from -60 to 60). The preference functions 
indicated by the solid circles were measured first, whilst those with open diamonds were measured about 
2 years later. The retest experiment used the rectangles already described for Expt 3. The test results 
however were obtained at the same time as Expt 1 was carried out but used the stimuli described for 
Expt 2; as the experiment was still at a somewhat early stage at that time, subject 6 actually used the 
inverse of the Expt 2 rectangles (i.e. rotated 90 degrees). These four subjects also carried out Expt 1 
proper and their results obtained with those stimuli are included in the results of Expt 1, as described 
elsewhere in this paper. Key: solid symbols, test; open symbols, retest. 


Individual differences in preference functions 


The small population preference for rectangles (Fig. 2) contrasts strongly with the far 
greater magnitude of individual preference functions (Figs 3, 4), and implies that 
inter-subject differences are larger than inter-subject similarities. Visual scrutiny of the 
individual preference functions did not suggest any obvious taxonomy for these variations, 
and a multivariate statistical technique was therefore used to identify the underlying 
structure. 

Figure 5. Population preference functions for Expts 2 and 3 for three separate types of triangles 
(ordinate: preference score; abscissa: 100 x log (ratio)). Separate curves are shown for Expt 2, 
Expt 3, and Austin & Sleight (1951). 

Table 4. Analysis of circular triads for population triangle preferences shown in Fig. 5: 
method is that of David (1968, p. 25) 

                                    Possible U range
Stimulus type   Experiment   U       Max   Min      χ²      d.f.   P
Triangles A     2            0.058   1     -0.038   181.9   74     < 0.001
Triangles A     3            0.077   1     -0.026   191.6   48     < 0.001
Triangles B     2            0.047   1     -0.038   161.7   74     < 0.001
Triangles B     3            0.065   1     -0.026   168.3   48     < 0.001
Triangles C     3            0.059   1     -0.025   192.4   59     < 0.001

Consider rectangles in Expt 3. Each subject made preference judgements on 105 pairs of 
stimuli. The preferences for each pair of subjects were correlated by a pairwise comparison 
of the judgements of each subject. These correlation coefficients were therefore independent 
of the ordered nature of the 15 stimulus values. The 40 x 40 correlation matrix thus 
produced was then factor analysed (by means of the FACTOR program of the SPSS 
statistical package, Nie et al., 1975), the first eight factors being extracted, and then a 
varimax rotation used, producing orthogonal factors. To analyse the nature of these 
factors the factor loadings of each subject on a particular factor were multiplied by the 
subject’s preference matrix, and the resultant weighted matrices then summed and the edge 
totals taken. These edge totals were then standardized so that the absolute total was equal 
to 2. These standardized edge totals were then plotted against rectangle shape, and the 
results inspected. A similar process was used for the other stimulus types in each of the 
experiments. 
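
The first step of this procedure, the inter-subject correlation matrix, can be sketched as follows. Pearson's product-moment coefficient is an assumption here, since the text does not say which coefficient was used, and the function names and toy data are purely illustrative. The sketch stops before the factor analysis itself, which was carried out in SPSS.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length rating lists.
    Undefined (division by zero) if either list has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(ratings_by_subject):
    """ratings_by_subject[s] holds subject s's 0..5 ratings for the same ordered
    list of stimulus pairs; the result is a subjects x subjects matrix."""
    m = len(ratings_by_subject)
    return [
        [pearson(ratings_by_subject[a], ratings_by_subject[b]) for b in range(m)]
        for a in range(m)
    ]


# Three hypothetical subjects judging four pairs; the third reverses the first.
toy_ratings = [[5, 4, 0, 1], [4, 5, 1, 0], [0, 1, 5, 4]]
r = correlation_matrix(toy_ratings)
```

Because each correlation is taken over pair judgements rather than over stimulus values, it carries no information about the order of the stimuli, as the text notes.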

It is clearly of interest to know whether when a subject chooses a particular type of 
rectangle he will also choose a particular type of triangle A, triangle B, etc. To examine 
this the factor loadings of each subject of each of the eight factors of the four stimulus 
types in Expt 3 were intercorrelated across subjects, so that the resultant 32 x 32 matrix 
contained the intercorrelations between factors, e.g. between the third factor on triangles A 
and the fourth factor on triangles B, etc. This correlation matrix was then factor analysed 
using the same program, and the first eight factors extracted, and then rotated by a 
varimax rotation. From this analysis it became readily apparent which of the main factors 
of the individual stimulus type factor analyses were interrelated. Whilst these relationships 
were usually clear the process was not always unambiguous, particularly when the process 
was repeated for Expt 2, with its more limited numbers of subjects. 

Figure 6. Factor alpha for the four stimulus types, for all experiments (abscissa: 100 log (ratio), 
from -60 to 60). The ordinate is arbitrary, 
being constructed such that the total absolute deviation of all the points from the abscissa zero 
should be +2. Separate curves are shown for Expts 1, 2 and 3. 

Figure 7. Factor beta: otherwise as for Fig. 6. 

Figure 8. Factor gamma for Expt 3 only: otherwise as for Fig. 6. 

To clarify the interrelationship between stimulus types the individual subjects' data from 
Expt 3 were again intercorrelated but this time not just for one stimulus type at a time, but 
for all four stimulus types. Each correlation was therefore based upon 250 pairs of 
judgements. This 40 x 40 inter-subject correlation matrix was factor analysed and rotated 
to varimax orthogonality. From the loadings of each subject on the orthogonal factors the 
underlying preference functions were determined for each stimulus type separately. A 
similar process was carried out for Expt 2. The eigenvalues for the first 10 factors from the 
analysis of Expt 3 are 11.48, 4.08, 3.08, 1.81, 1.49, 1.30, 1.13, 1.09, 0.97, 0.94. A 
‘scree-slope analysis’ (Child, 1970) suggests that the first two factors are highly significant 
and the next two possibly so. The rest of the factors are probably too small, even if real, to 
be of any interest. It is possible of course that with a larger sample further factors would 
appear. The identification of similar factors in the three experiments was carried out by 
inspection. The first two factors of Expt 3 are readily identifiable in Expts 1 and 2, but the 
identification of the third and fourth factors is not clear and these have been omitted from 
this paper. 
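
The eigenvalues listed for Expt 3 can be put into scree form with a few lines of arithmetic. The sketch assumes they are eigenvalues of the unreduced 40 x 40 correlation matrix, so that all 40 sum to 40; the extraction method used by the FACTOR program is not stated in the text, so the percentages are indicative only.

```python
eigenvalues = [11.48, 4.08, 3.08, 1.81, 1.49, 1.30, 1.13, 1.09, 0.97, 0.94]
n_subjects = 40  # assumed trace of the 40 x 40 correlation matrix

# Share of total variance attributed to each factor, in per cent.
percent_variance = [100 * ev / n_subjects for ev in eigenvalues]

# Successive drops between neighbouring eigenvalues: the scree slope.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
```

On this reading the first factor accounts for roughly 29 per cent of the variance, and the drop from the first to the second eigenvalue (7.40) is far larger than any later drop, which is the pattern a scree inspection looks for.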

The first factor identified in Expt 3 has been called factor alpha, and is shown in Fig. 6. 
In its positive form it is a preference for squares and their triangular derivations, and a 
dislike for all other figures. In Expt 1H there is a suggestion of a horizontal-vertical 
illusion, although its magnitude is rather large. The triangle preferences are all broadly 
identical with each other. 

Factor beta is the second factor isolated from Expt 3, and is shown in Fig. 7. It is rather 
more interesting than factor alpha, being bimodal in the case of rectangles and right-angled 
triangles, and also showing a strong suggestion of being related to the golden section, both 
positive (+φ) and negative (-φ). Triangles A and B show similar curves, but differ from 
the other two stimulus types in being unimodal. They are also out of phase, triangles B 
being the mirror-rotation of triangles A around the ‘square’. As noted earlier with the 
preference for triangles A and B (Fig. 5) the implication is that it is the form of the 
triangles which matters and not the orientation of the enclosing rectangle. 

Figure 9. Factor delta for Expt 3 only: otherwise as for Fig. 6. 

Table 5 shows the loading of each subject of Expt 3 on the first four factors. Twenty-two 
(55 per cent) of the subjects have a significant (> 0.3 or < -0.3) (Child, 1970) loading on 
the alpha factor, and 18 (45 per cent) have a significant loading on the beta factor. Only 11 
subjects (27.5 per cent) have non-significant loadings on both factors. Eleven (27.5 per 
cent) subjects have significant loadings on both factors, the majority (8) having a positive 
loading on both factors. It is important to note that seven subjects have negative loadings 
on factor alpha, and three subjects have negative loadings on factor beta: the preference 
functions for these subjects are therefore the inverse of those shown in Figs 6 and 7. It is 
these negative loadings which account for the relative flatness of the preference functions of 
Figs 2 and 5. 
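
As a quick consistency check on the counts just given, the number of subjects with no significant loading on either factor follows from the other three figures by inclusion-exclusion. The variable names are illustrative.

```python
n_subjects = 40
alpha_significant = 22   # significant loading on factor alpha
beta_significant = 18    # significant loading on factor beta
both_significant = 11    # significant loading on both factors

# Subjects loading on at least one factor, then those loading on neither.
either = alpha_significant + beta_significant - both_significant
neither = n_subjects - either
percent_neither = 100 * neither / n_subjects
```

This gives 29 subjects with a significant loading on at least one factor and 11 (27.5 per cent) on neither, matching the figures in the text.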

The last two factors (factor gamma and factor delta) are composed mainly of 
non-significant loadings with just a few subjects with significant loadings. It is difficult to 
interpret these factors with certainty. For interest and completeness they are shown in Figs 
8 and 9, but their identification should be regarded as only tentative. Factor gamma is of 
interest in that it is almost identical to factor alpha except for the inversion of the triangle 
preference functions: the status of this finding is not at all clear, but it accounts for the 
partly ambiguous results obtained when the stimulus types were factored independently 
and then the loadings refactored, as described earlier. Factor delta is of particular interest 
for it is asymmetric around log. ratio zero, and approximates, in the case of the rectangle, 
to a unimodal golden section curve. Clearly it is necessary to have asymmetric preference 
functions of this type in order to account for the individual preference functions of the type 
shown by subjects 73, and possibly 50, in Fig. 3. 


Conclusions 


After an historical and experimental review it was concluded that the golden section 
phenomenon, particularly as delineated by Fechner, was probably unreliable and mainly 
artifactual. The paired-comparison technique is probably free from the artifacts of ranking 
methods; nevertheless with four different series of stimulus values, consistent preference 
functions were obtained for rectangles, these preferences being stable in several subjects 
over a period of about 2 years. However population preferences were small in comparison with 
individual variation. After a moderate degree of statistical manipulation using multivariate 
analysis it was possible to produce an objective taxonomy of these individual variations 
and to produce at least two major factors which are readily interpretable and probably 
reliable, and also two other factors of probable significance and of some interest. It is 
presumed that the wide range of particular subject preference functions is a function of 
differential admixture of these several types of preference function. The other simple 
figures studied, three types of triangle, all bear a simple relation to the rectangle functions 
obtained. 

It has been assumed throughout this study that the responses of subjects in this study 
truly represent ‘aesthetic’ responses: this may however be, at best, a tenuous assumption. 
The wide range of subject preference functions makes one speculate whether one is really 
dealing with some form of experimenter demand effect. Presumably since a subject has 
agreed to spend an hour doing the experiment he feels obliged to actually do something: 
this doing need not however represent aesthetic behaviour. Against this hypothesis are two 
items of evidence. Firstly the subjects claim that they are making aesthetic judgements, and 
that they feel the request to make such judgements is a reasonable one. Secondly, in those 
few subjects who have been retested over a 2-year period, there is evidence of a high degree 
of replicability (and also the subjects themselves claimed not to be able to remember their 
previous judgements). 

It is perhaps reasonable therefore to assume that the results in this type of experiment 
represent some form of elemental aesthetic judgement of the type postulated by Socrates. If 
so it would be interesting to know why subjects differ, how constant their preference 
functions are over longer periods of time, and whether their particular preference functions 
correlate with other variables (shape of eye-field, personality variables, etc.). Also, does the 
degree of loading on particular factors vary within subjects, or correlate with other 
variables? All of these items of information could give some clues as to the origin of the 
phenomenon, and its nature. The author has examined the data already presented in this 
paper in terms of the sex of the subject, and their particular area of study (arts vs. sciences) 
and has been able to find no significant links with the factors. 

It would thus appear (admittedly somewhat to this author’s surprise) that there is 
moderately good evidence for the phenomenon which Fechner championed, even though 
Fechner’s own method for its demonstration is, at best, highly suspect owing to 
methodological artifacts. Whether the golden section per se is important, as opposed to 
similar ratios (e.g. 1.5, 1.6, or even 1.75), is very unclear, techniques at present not being 
accurate enough to make adequate measurements. 


Acknowledgements 


I am grateful to Dr Nick Humphrey for his encouragement of this study, particularly for his 
supervision of Expt 1, which was carried out as a Part II project for the Natural Sciences Tripos of 
the University of Cambridge. Experiment 3 was funded by a grant from the Durham Fund of King’s 
College, Cambridge, to whom I am very grateful, as also to Professor O. L. Zangwill and Dr 
O. J. Braddick for providing experimental facilities for that experiment. 


References 


ALBERTI, L. B. (1485). De re aedificatoria. Florence. Translated by J. Leoni as The Ten Books of 
Architecture. London: Tiranti, 1955. 

ARCHIBALD, R. C. (1920). Notes on the Logarithmic Spiral, Golden Section, and the Fibonacci series. 
In J. Hambidge (ed.), Dynamic Symmetry: The Greek Vase. Appendix, pp. 146-157. New Haven, Conn.: 
Yale University Press. 

ARNHEIM, R. (1955). A review of proportion. Journal of Aesthetics and Art Criticism, 14, 44-57. 

AUSTIN, J. R. & SLEIGHT, R. B. (1951). Aesthetic preference for isosceles triangles. Journal of Applied 
Psychology, 35, 430-431. 

BENJAFIELD, J. (1976). The ‘Golden Rectangle’: Some new data. American Journal of Psychology, 89, 
737-743. 

BERLYNE, D. E. (1970). The Golden Section and hedonic judgements of rectangles: A cross cultural 
study. Sciences de l'Art, 7, 1-6. 

BERLYNE, D. E. (1971). Aesthetics and Psychobiology. New York: Appleton-Century-Croft. 

BIRKHOFF, G. (1933). Aesthetic Measure. Cambridge, Mass.: Harvard University Press. 

BORING, E. G. (1940). History of Experimental Psychology, p. 279. New York: Appleton-Century-Croft. 

BRETT, G. S. (1921). A History of Psychology, vol. III, p. 128. London: George Allen & Unwin. 

BRION, M. (1960). Albrecht Dürer: His Life and Work. London: Thames & Hudson. 

BURKE, E. (1757). A Philosophical Enquiry into the Origin of our Ideas of the Sublime and Beautiful. 
London: R. & J. Dodsley. 

CHILD, D. (1970). The Essentials of Factor Analysis. London: Holt, Rinehart & Winston. 

DAVID, H. A. (1968). The Method of Paired Comparisons. London: Griffin. 

EYSENCK, H. J. (1968). An experimental study of aesthetic preferences for polygonal figures. Journal of 
General Psychology, 79, 3-17. 

EYSENCK, H. J. & CASTLE, M. (1970). Training in art as a factor in the determination of preference 
judgements for polygons. British Journal of Psychology, 61, 65-81. 

EYSENCK, H. J. & TUNSTALL, O. (1968). La Personnalité et l'esthétique des formes simples. Sciences de 
l'Art, 5, 3-9. 

FECHNER, G. T. (1825). Vergleichende Anatomie der Engel. Leipzig. 

FECHNER, G. T. (1835). Das Büchlein vom Leben nach dem Tode. Leipzig. Translated by H. Wernekke as 
On Life after Death. Chicago, 1906. 

FECHNER, G. T. (1848). Nanna, oder über das Seelenleben der Pflanzen. Leipzig: Leopold Bok. (See 
Fechner, 1946, for a partial translation.) 

FECHNER, G. T. (1860). Elemente der Psychophysik. Leipzig: Breitkopf & Hartel. Translated by 
H. E. Adler as The Elements of Psychophysics. New York: Holt, Rinehart & Winston, 1966. 

FECHNER, G. T. (1876). Vorschule der Aesthetik. Leipzig: Breitkopf & Hartel. 

FECHNER, G. T. (1946). Religion of a Scientist. London: Kegan Paul, Trench & Trubner. 

GODKEWITSCH, M. (1974). The Golden Section: An artefact of stimulus range and measure of preference. 
American Journal of Psychology, 87, 269-277. 

HAINES, T. H. & DAVIES, A. E. (1904). The psychology of aesthetic reaction to rectangular forms. 
Psychological Review, 11, 249-281. 

HENZLMANN, E. (1860). Théories des Proportions. Paris: A. Bertrand. 

HINTZ, J. M. & NELSON, T. M. (1970). Golden Section: Reassessment of the perimetric hypothesis. 
American Journal of Psychology, 83, 126-129. 

HINTZ, J. M. & NELSON, T. M. (1971). Haptic aesthetic value of the Golden Section. British Journal of 
Psychology, 62, 217-223. 

HUNTLEY, H. E. (1970). The Divine Proportion. New York: Dover Books. 

JONCKHEERE, A. R. (1954). A distribution-free k-sample test against ordered alternatives. Biometrika, 41, 
133-145. 


524 I. C. McManus 


KENDALL, M. G. & BABINGTON SmITH, B. (1940). On 
the method of paired comparisons. Biometrika, 31, 
324-345 

Lato, C. (1908) L'Esthétigue expérimentale et 
contemporaine. Paris: Alcan 

McManus, I. C. (1978). The horizontal-vertical 
illusion and the square. British Journal of 
Psychology, 69, 369—370. 

Ne, N. H. et al (1975). Statistical Package for the 
Social Sciences, 2nd ed. New York. McGraw-Hill. 

NiENSTEDT, C. W. & Ross, S (1951). Preferences for 
rectangular proportions in college students and the 
aged. Journal of Genetic Psychology, 78, 153-158 

PaciorLt, L (1509). De Divina Proportione Venice: 
Paganinos de Paganinis de Brixia. 

PANOFSKY, E. (1970). The history of the theory of 
human proportions as a reflection of the history of 
styles. In, Meaning in the Visual Arts 
Harmondsworth, Middx: Penguin. 

PEHL, J. (1976). The ‘Golden Section’: An artefact of 
stimulus range and demand characterist.cs 
Perceptual and Motor Skills, 43, 47-50. 

PEHL, J. (1978). The Golden Section. The ‘true’ 
ratio? Perceptual and Motor Skills, 46, 831-834 

Russo, J. E. & Rosen, L. D. (1975). An eye-fixation 
analysis of multi-alternative choice Memory and 
Cognition, 3, 267-276 

Sarton, G. (1951). Isis, 42, 47. 

SCHOOLING, W (1914). The ¢ progression. In 


T. A. Cook (ed.), The Curves of Life. Appendix, pp. 
441-447. London: Constable. 

Surp.ey, W. C., BATTMAN, P. E. & STEELE, B. A. 
(1947). The influence of size upon preferences for 
rectangular proportions in children and adults 
Journal of Experimental Psychology, 37, 333-336. 

THOMPSON, G G. (1946). The effect of chronological 
age on aesthetic preferences for rectangles of 
different proportions Journal of Experimental 
Psychology, 36, 50-58. 

THORNDIKE, E. L (1917). Individual differences in 
Judgements of the beauty of simple forms 
Psychological Review, 24, 147-153. 

Vitruvius PoLuo (1931). On Architecture. Translated 
by F Granger. London: Loeb Classica] Library 

WEBER, C. O. (1931). The aesthetics of rectangles and 
theories of affection Journal of Applied Psychology, 
15, 310-318 

WiTMER, L. (1894). Zur experimentallen Aesthetik 
einfacher raumlicher Formverhaltnisse. 
Philosophische Studien, 9, 96-144 

WITTKOWER, R. (1960) The changing concept of 
proportion. Daedalus, 89, 199-215. 

ZEISING, A. (1854). Neue Lehre von den Proportionen 
des menschlischen Kórpers Leipzig Weigel. 

ZEISING, A. (1855). Aesthetische Forschungen. Frankfurt 
am Main. 

ZUSNE, L. (1970). Visual Perception of Form New 
York: Academic Press. 


Received 14 July 1978, revised version received 7 June 1979 


Requests for reprints should be addressed to Dr I. C. McManus, Department of Psychology, Bedford College, 
Regent's Park, London NW1 4NS. 


British Journal of Psychology (1980), 71, 525-539. Printed in Great Britain 


Is the immediate memory span determined by subvocalization rate? 


Lionel Standing, Barbara Bond, Philip Smith and Catherine Isely 





The immediate memory span for eight types of material was measured, using 24 subjects, and various 
possible predictor variables were recorded. Span size for individual subjects was most highly 
correlated (r = -0.75) with their time-scores on a subvocalization task; it appeared essentially 
unrelated to subject variables such as age, IQ, echoic memory and longer-term memory. Span sizes 
for the different materials were also highly correlated with the mean subvocalization times found for 
these stimuli (r = -0.90), although the information content of the materials also appeared influential. 
Together, subvocalization rate and information content accounted for 98 per cent of the observed 
variance in span size across materials. A second experiment indicated that span size and 
subvocalization time were both unaffected by alcohol, while a third study indicated that span was 
decreased, and subvocalization time increased proportionately, by testing in a bilingual subject's 
second language. The data of all three experiments suggest that the span for a given type of material 
corresponds approximately to the number of items that the subject can subvocalize silently or in 
whispers in a fixed time interval (about 1.8 s or 2.2 s respectively, although these values are not 
constant across experiments). The results are interpreted as being consistent with a verbal-loop 
process in short-term memory, such that more items are maintained in storage if they can be 
circulated rapidly through this loop. 





The present study is concerned with the immediate memory span, or number of unrelated 
stimulus items that a subject can report perfectly, in sequence, after a single presentation. 
Its major aim is to determine whether the span may be predicted accurately from the 
subject's performance in other cognitive tasks, and to identify those tasks. 

The memory span is a behavioural measure showing unusual consistency across different 
subjects and test materials. An informal experiment by the first author, using 58 
psychology students as subjects, yielded a mean span and standard deviation (between 
subjects) of 7.38 and 0.69 items for auditory letter stimuli; with digit stimuli the 
corresponding values were 7.93 and 0.95 items. Nevertheless, small but reliable effects upon 
the span have often been noted from both stimulus and subject variables (Blankenship, 
1938; Brener, 1940). The present study examines both these factors and attempts to predict 
the magnitude of the span for individual subjects and for various classes of stimulus 
material. 

Ideally, the span would be predictable from a detailed knowledge of its component 
mechanisms. Lacking this, we may nevertheless identify several likely determinants of the 
span's magnitude, on the basis of previous studies. Of primary importance, various 
investigations suggest that some form of internal, speech-like process may function as a 
holding mechanism in immediate memory tasks (Conrad, 1964; Baddeley, 1966; Sperling, 
1967; Murray, 1968). Therefore we may expect that the more rapidly a given subject can 
repeat stimulus items to himself, to rehearse them until recall is possible, the greater should 
be his span. Or, if one type of stimulus can be subvocalized more rapidly than another, the 
span should again increase correspondingly. This hypothesis gains some support from the 
study of Glanzer & Clark (1963), although this dealt with the length of coded information 
in the presumed ‘verbal loop’ rehearsal mechanism rather than the speed of operating this 
mechanism. Likewise, Baddeley et al. (1975) have demonstrated an inverse relationship 
between articulation rate (varied by changes in word length) and immediate recall. 
However, Lyon (1975) in two short experiments failed to find a significant effect of 
rehearsal speed on the digit span. The present study continues this inquiry, in examining 
whether a significant positive correlation exists between the span and subvocalization rates 
for different subjects and materials, which we postulate may be taken as indices of 
attainable rehearsal rates. Both overt and covert measures of subvocalization are examined 
in the present study, since previous research does not identify which of these is the more 
significant behavioural measure. 

The above view of the span implies that the effective duration of rehearsed material must 
also be considered: presumably if a rehearsed item lasts longer in a verbal-loop holding 
mechanism before it has decayed to threshold strength, this will give the subject longer in 
which to rehearse other items before it becomes necessary to recycle the first item. This 
should increase the span. This study therefore attempts to assess the effective duration of 
auditory stimulus items, for different subjects, by means of two tests. On the tentative 
hypothesis that an internally generated item in the verbal loop decays similarly to an 
externally presented auditory stimulus, we attempt to assess decay times by means of two 
echoic storage tasks (Neisser, 1967). Each is intended to measure the time elapsing before 
an auditory stimulus has decayed to threshold in the subject's short-term sensory storage 
mechanism. 

Other possible factors suggested by previous research as influencing the span include 
longer-term learning ability, as in a serial rote-learning task (Martin, 1978), intelligence 
quotient (Jensen, 1970) and the subject's chronological age (Jacobs, 1887). The present 
study therefore also attempts to relate each of these variables to the magnitude of the span. 
In the first experiment, each subject was assessed for immediate memory span with eight 
materials (using a standard auditory presentation/vocal recall task), and for each of the 
variables given above. The span measures were then systematically correlated against each 
of these possible factors. 


Experiment 1 
Method 


Subjects. Twenty-four experimentally naive subjects were employed, paid on an hourly basis. Their 
mean age was 20.1 (SD 3.1), with a mean IQ of 119 (SD 17.4). Twenty-two were school or university 
students (non-psychologists); 12 were males. 


Test battery. The following set of tests and subtests was employed for all subjects. 

A. Memory span. For each type of material listed below, four determinations of the span were 
made. During each determination, the subject heard 4, 5, 6, 7, 8, or 9 items read aloud on successive 
trials at a rate of 1 item/s from a tape-recording, followed by a momentary tone as a signal to 
attempt immediate recall of the preceding sequence. The above lengths of sequence were employed in 
ascending order for the first and third determinations, but descending order for the second and 
fourth. The occasional subject who succeeded at nine items was given an extra test with 10 items 
under the same conditions. All stimulus sequences were selected randomly with replacement from the 
parent population (the Merriam-Webster dictionary in the case of the verbal items) and were used 
once only. The inter-trial interval was 15 s. Instructions to the subject, in addition to explaining the 
above experimental design and stressing the need for concentration, required him to guess at each 
item that could not be recalled. 

1. binary digits (0, 1) 
2. decimal digits (0-9) 
3. decimal digits (3-7) 
4. letters (A-Z) 
5. letters (J-Q) 
6. words (one syllable) 
7. words (two syllables) 
8. words (three syllables) 

B. Subvocalization time. For each of the eight types of material used in Test A above, four 
determinations of subvocalization time were made for passages of a fixed length. In each 
determination, the time to subvocalize 50 items was measured with a stopwatch. The first and third 
determinations employed silent subvocalization; the second and fourth involved whispered 
subvocalization. The subject was required to subvocalize as fast as possible in either case, reading the 
items from cards where they were presented as lists in capital letters. 

C. Echoic storage for tones. A 200 ms, 2 kHz tone was presented repetitively at about 70 dB above 
threshold, using a Marietta tone generator and Hunter timers. The inter-tone interval was variable, 
under the subject’s control. The subject was instructed to set this interval so that each new 
presentation of the tone occurred just as his auditory memory of the preceding tone faded to zero, i.e. 
the point at which he could no longer retain an exact image of what the stimulus had sounded like, 
but merely knew that some tone had occurred. Eight determinations were made of this inter-stimulus 
interval, in alternate ascending-descending order. 

D. Echoic storage for white noise. Following the type of procedure originated by Guttman & Julesz 
(1963), a tape-recording of continuously recycled white noise segments was played to the subject. The 
cycle time varied from 300 to 1300 ms, in steps of 100 ms. The subject listened to each segment as 
long as desired (at least 20 s), and judged whether he heard its component rhythms; the cycle time 
where this no longer occurs is viewed as corresponding to the echoic storage time. Allowance was 
made for reversals of judgement occurring beyond this point. 

E. Serial learning. Using a memory drum, a list of 10 low-association-value nonsense syllables 
(Hilgard, 1951, Table 9) was shown successively to the subject with each syllable preceded by its 
serial number (1, 2, 3...). Both syllables and numbers were shown at a rate of 2 s per item, and with 
a 30 s inter-trial interval. On each trial after the first, the subject attempted to anticipate verbally each 
syllable as its number was shown, until a criterion of two successive correct trials was reached. The 
dependent variable was the number of trials to criterion. 

F. IQ. The Thorndike-Lorge verbal and non-verbal IQ tests were administered under standard 
conditions (forms 1 and 2, level H). 


Procedure. All testing was performed individually, requiring about 4 h per subject. Half the subjects 
were tested in the sequence listed above, and half in the reverse sequence of tests and subtests. These 
data were pooled since for no test was there a significant difference between the two groups (all 
P > 0.05, Mann-Whitney test). 


Results 


The memory span was calculated for each subject (separately for each type of material) 
from the pooled ascending and descending trials, which did not differ significantly 
(P > 0.05, Wilcoxon test). The data were scored to yield a corrected span in accordance 
with the procedure of Woodworth & Schlosberg (1954). In this, a basal sequence length 
was first noted: at this length (and below) the subject always succeeded, on all four 
repetitions. Further credit of 0.25 item was then allowed for each success with longer 
sequences and added to the basal sequence length. The uncorrected span was also found for 
each subject, representing the mean of the sequence lengths below which no error occurred 
for a given determination. (In either case, a given response was scored as correct only if all 
items in the stimulus sequence were recalled, in correct order). Overall means for subjects 
and for materials were then derived from the above spans. 
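
The two scoring rules can be illustrated with a short Python sketch. This is an interpretation of the verbal description above rather than the scoring procedure actually used: the function names, the data layout, and the treatment of a subject who fails even the shortest sequence are all assumptions made for illustration.

```python
def corrected_span(results):
    """results maps sequence length -> list of booleans, one per determination,
    True when the whole sequence was recalled in the correct order."""
    lengths = sorted(results)
    # Assumption: if the shortest tested length was ever failed, the basal
    # length is taken to be one item below it.
    basal = lengths[0] - 1
    for n in lengths:
        if all(results[n]):
            basal = n
        else:
            break
    # 0.25 item of credit for every success at lengths above the basal length.
    extra = sum(0.25 for n in lengths if n > basal for ok in results[n] if ok)
    return basal + extra


def uncorrected_span(results):
    """Mean, over determinations, of the longest length reached without error."""
    lengths = sorted(results)
    n_determinations = len(results[lengths[0]])
    longest_per_run = []
    for i in range(n_determinations):
        longest = lengths[0] - 1
        for n in lengths:
            if results[n][i]:
                longest = n
            else:
                break
        longest_per_run.append(longest)
    return sum(longest_per_run) / len(longest_per_run)


# Hypothetical subject: perfect at 4 and 5 items, patchy at 6 and 7.
example = {
    4: [True, True, True, True],
    5: [True, True, True, True],
    6: [True, False, True, True],
    7: [False, True, False, False],
    8: [False, False, False, False],
    9: [False, False, False, False],
}
print(corrected_span(example), uncorrected_span(example))  # 6.0 5.75
```

In this hypothetical case the corrected span exceeds the uncorrected span because the isolated success at seven items earns credit under the first rule but not under the second.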

The mean subvocalization time for each subject and material was calculated separately 
for silent and for whispered conditions, since the latter measure was appreciably greater 
than the former, especially in the case of two- and three-syllable words. This is shown in 
Table 1, which also gives the mean span, averaged across subjects, for each type of 
material. 

Analysis of variance indicated that the corrected span varied significantly with both 
subjects and materials (F = 10.5, d.f. = 23, 161, P < 0.01; F = 67.9, d.f. = 7, 161, 
P < 0.01), as did the summed silent plus whispered subvocalization times (F = 5.84, 
d.f. = 23, 161, P < 0.01; F = 4.09, d.f. = 7, 161, P < 0.01). 

Table 1. Experiment 1: Mean immediate memory spans (corrected and uncorrected) for 
different types of material, together with mean times (in s) to subvocalize 50 items (silent 
and whispered conditions), and corresponding values of stimulus information, in bits/item. 

                           Stimulus          Span            Subvocalization time
Material                information  Corrected  Uncorrected    Silent  Whispered
Binary digits (0, 1)           1.00       6.63         6.50     15.88      17.31
                                        (1.08)                 (4.57)     (4.29)
Decimal digits (0-9)           3.32       6.10         6.02     15.96      18.29
                                        (1.09)                 (4.11)     (4.28)
Decimal digits (3-7)           2.32       6.33         6.26     16.54      17.23
                                        (0.93)                 (4.43)     (3.66)
Letters (A-Z)                  4.70       5.78         5.69     17.05      18.90
                                        (0.90)                 (4.53)     (3.71)
Letters (J-Q)                  3.00       5.71         5.65     17.15      18.63
                                        (1.04)                 (4.42)     (3.65)
Words (1 syllable)            11.94       4.71         4.70     18.44      21.81
                                        (0.65)                 (5.04)     (4.28)
Words (2 syllables)           12.97       4.34         4.32     20.46      25.23
                                        (0.76)                 (5.33)     (5.44)
Words (3 syllables)           12.40       3.75         3.75     26.35      35.98
                                        (0.68)                 (8.37)    (10.91)

Note. Standard deviations are given in parentheses; n = 24. 


Subjects. Using individual subjects' mean scores on each test (A-F), a matrix of Pearson 
correlations between the various tests was drawn up, including chronological age as an 
additional variable. Both corrected and uncorrected span values were employed separately, 
as were silent and whispered subvocalization times, and verbal and non-verbal IQ scores. 
The span score used for each subject represented his mean value, averaged across materials, 
as did the two subvocalization measures. The correlation matrix is given as Table 2; these 
coefficients require a value of at least 0.344 for statistical significance at the 0.05 level, or 
0.472 to reach the 0.01 level, being based on 22 d.f. 

Regarding first the prediction of individual subjects' spans from their scores on the other 
tests, Table 2 shows that the best predictor of a subject's mean span is his mean 
subvocalization time, either silent or whispered. (For simplicity, the results will be 
discussed in terms of the corrected span and whispered subvocalization time values rather 
than the alternative scores available unless stated otherwise; the same conclusions hold, 
except where specified). The above combination yields a correlation of -0.751, which gives 
some support to the proposed subvocalization model. 

Since neither of the echoic storage measures correlated significantly with the span, this 
theoretical prediction was not supported. The span also failed to show any correlation with 
longer-term memory as shown in serial learning. Variability of performance in the latter 
task was markedly higher than in the span task (mean trials to criterion = 4.29, 
SD = 1.94). Since the grand mean span was 5.39 items, with a standard deviation between 
subjects of 0.699, the relative variability of the serial learning scores was 3.5 times as great 
as for the span. By contrast, the relative variability for span scores and for subvocalization 
times was almost the same (13 per cent and 15 per cent respectively), though this may be 
fortuitous. 
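
The comparison of relative variability can be checked with a few lines of Python. This is only an arithmetic sketch: it assumes that "relative variability" means the standard deviation divided by the mean (the coefficient of variation), and the function name is illustrative.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


serial_cv = coefficient_of_variation(4.29, 1.94)  # trials to criterion
span_cv = coefficient_of_variation(5.39, 0.699)   # grand mean span
ratio = serial_cv / span_cv

print(f"serial learning: {100 * serial_cv:.0f} per cent")
print(f"memory span:     {100 * span_cv:.0f} per cent")
print(f"ratio:           {ratio:.1f}")
```

Under that assumption the ratio of the two coefficients comes to about 3.5, and the span coefficient to about 13 per cent, in agreement with the figures quoted above.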

The only other factors besides subvocalization time which were significantly correlated 
with the span were IQ and age. The significant relationship (P < 0.01) between verbal IQ 
and the span exceeds that of non-verbal IQ (P < 0.05). This suggests that the apparent 
effects of IQ could occur merely because it is a variable which is confounded with verbal 
fluency in general, or subvocalization time in particular. The latter hypothesis receives 
some support from Table 2, since the predicted negative correlation between IQ and 
subvocalization time is observed in all four cases (verbal and non-verbal IQ versus silent 
and whispered subvocalization time). Each of these four correlations is statistically 
significant (P < 0.05 or P < 0.01). 

Examining the above point further, the hypothesis that IQ may be essentially irrelevant 
to the span is supported by an examination of partial correlations. Thus the span does not 
vary significantly with either verbal or non-verbal IQ if subvocalization time is held 
constant; the partial correlations are respectively 0.277 and 0.104. Conversely, a test of the 
multiple correlation existing between the span and subvocalization time plus IQ showed 
that the joint predictive power of these two variables was virtually identical to that of 
subvocalization time alone, since the latter coefficient (0.751) was raised to only 0.787 or 
0.756 by including verbal or non-verbal IQ respectively as a second predictor variable. The 
data therefore do not suggest a causal relation between intelligence and the span. 

Returning to age, the only other significant predictor of the span (uncorrected scores 
only, with P < 0.05), the pattern of data is analogous to that for IQ. Thus, age is 
confounded with subvocalization time (whispered only, P < 0.01), and when the latter 
variable is held constant by means of partial correlation, the association between age and 
the span effectively vanishes (dropping to 0.08). Likewise, when multiple correlation is 
employed with age as the second variable, the association between span and 
subvocalization is again raised only trivially (from 0.751 to 0.754). Age per se appears to be 
irrelevant to the memory span. 


Materials. A second approach to the data is to consider span scores for the eight types 
of material employed here, pooled across subjects. When the data are classified according 
to type of material, as in Table 1, the correlations between span values and subvocalization 
times for a given type of material clearly support the previous conclusions, as may be seen 
from Table 3. The correlations in Table 3 are all strongly in the direction predicted by the 
experimental hypothesis. Checking this finding for consistency, a separate correlation was 
also calculated within each subject between his spans and his subvocalization times for the 
eight materials. All 48 coefficients were strongly negative (mean r between span and 
whispered subvocalization time = -0.80; or -0.75 using the silent estimates). 

However, the materials may alternatively be classified in relation to their information 
content per item. (Obviously, various other classifications of the stimuli are possible.) Since 
the information content of a word, for example, is higher than that of a binary digit, the 
results given above could be due to this variable. The information values, given in Table 1, 
are simply $\log_2 n$, where n is the number of possible alternatives for a single stimulus item. 
In the case of the verbal stimuli, this was calculated on a whole-word basis (n then being 
the estimated number of one-, two- or three-syllable words in the dictionary used: 3930, 
8020 and 5410 respectively). When the span is then correlated against these information 
values, high negative coefficients result, as shown in Table 4 (P < 0.01). While this result 
does not negate the previous conclusions regarding the importance of subvocalization rate 
(information values, being the same for all subjects, cannot account for the inter-subject 
differences that were found), it requires analysis. 
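
The information values can be reproduced from the set sizes given in the text. The following Python sketch simply evaluates the base-2 logarithm of n for each material; the dictionary and function names are illustrative and not part of the original analysis.

```python
import math

# Number of possible alternatives per item for each material, as stated in the text.
SET_SIZES = {
    "binary digits (0, 1)": 2,
    "decimal digits (0-9)": 10,
    "decimal digits (3-7)": 5,
    "letters (A-Z)": 26,
    "letters (J-Q)": 8,
    "words (1 syllable)": 3930,
    "words (2 syllables)": 8020,
    "words (3 syllables)": 5410,
}


def bits_per_item(n):
    """Information content of one item drawn from n equiprobable alternatives."""
    return math.log2(n)


for material, n in SET_SIZES.items():
    print(f"{material:22s} {bits_per_item(n):6.2f}")
```

The printed values agree with the stimulus information column of Table 1 to two decimal places, on the usual assumption that all alternatives are equally likely.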

In Table 4 are shown the partial and multiple correlation coefficients relevant for 
assessing the interaction of subvocalization time and information content as determinants 
of the span. From the former it is clear that when either predictor variable is held constant, 




Table 3. Experiment 1: Correlations between memory span and subvocalization times, 
based on scores for the eight types of material, averaged across subjects 

                     Subvocalization     Subvocalization
                     time (whispered)    time (silent)
Corrected span          -0.902**            -0.808**
Uncorrected span        -0.905**            -0.900**

** Significant at 0.01 level. 


Table 4. Experiment 1: Correlations between span size and the information content of a 
given type of material, plus partial and multiple correlations linking memory span with 
subvocalization times and information content (data averaged across subjects) 

Criterion variable                Predictor variable(s)                                            Correlation
Corrected span                    Information content                                                 -0.93
Uncorrected span                  Information content                                                 -0.96
Corrected span                    Information content (silent subvocalization time constant)          -0.83
Corrected span                    Information content (whispered subvocalization time constant)       -0.87
Corrected span                    Silent subvocalization time (information content constant)          -0.45*
Corrected span                    Whispered subvocalization time (information content constant)       -0.82
Corrected span                    Information content and silent subvocalization time                 -0.95
Corrected span                    Information content and whispered subvocalization time              -0.98
Subvocalization time (silent)     Information content                                                  0.75
Subvocalization time (whispered)  Information content                                                  0.76

* Non-significant; all other correlations are significant at 0.01 level. 


the other alone determines the span closely (provided that whispered rather than silent 
subvocalization times are used). From the multiple correlations it follows that the two 
variables together determine the span virtually perfectly (r = -0.98). Table 4 further shows 
that significant correlations also exist between information value and subvocalization time. 
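
The partial and multiple coefficients for the whispered condition can be recovered, to rounding, from the three pairwise correlations reported in Tables 3 and 4. The Python sketch below uses the standard first-order partial correlation and two-predictor multiple correlation formulas; that the original coefficients were obtained in exactly this way is an assumption, and the function names are illustrative.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of y with two predictors (returned unsigned)."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)


# Pairwise correlations across the eight materials (Tables 3 and 4).
r_span_info = -0.93      # corrected span vs. information content
r_span_whisper = -0.902  # corrected span vs. whispered subvocalization time
r_info_whisper = 0.76    # information content vs. whispered time

info_given_time = partial_r(r_span_info, r_span_whisper, r_info_whisper)
time_given_info = partial_r(r_span_whisper, r_span_info, r_info_whisper)
joint = multiple_r(r_span_info, r_span_whisper, r_info_whisper)

print(f"{info_given_time:.2f} {time_given_info:.2f} {joint:.2f}")  # -0.87 -0.82 0.98
```

Note that the multiple correlation is by definition non-negative; Table 4 reports it with a negative sign, presumably to match the direction of the simple correlations.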

To quantify the above relationships between the span, subvocalization time and stimulus 
information, multiple regression analyses were performed. These indicated that the best 
description of the span for a given material is according to the formula: 


$$\mathrm{Span} = 3.46 + 1.06\,\mathrm{WSR} - 0.097\,H \ \text{items} \qquad (1)$$


where WSR is the whispered subvocalization rate for a given material (the mean number of 
items whispered per s) and H is the information content of that material (bits per item). 
The multiple correlation between observed and predicted span sizes, using this formula, 
was 0.9904. Obtained spans for the eight materials are compared with the values expected 
from equation (1) in Fig. 1 below. 

Figure 1. Experiment 1: relationship between the observed span sizes for eight materials and the 
corresponding spans expected on the basis of equation (1). 


Alternatively, equally accurate predictions can be made from the equation 

$$\mathrm{Span} = 2.95 + 1.15\,\mathrm{SSR} - 0.11\,H \ \text{items} \qquad (2)$$


where SSR is the silent subvocalization rate for a given material (items per s) and H is its 
information content (bits per item). 
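
As a worked illustration, the two regression equations can be applied to the Table 1 values. The Python sketch below assumes that the subvocalization rate is obtained as 50 items divided by the mean time taken to subvocalize 50 items; the function names are illustrative.

```python
def span_from_whispered(time_for_50_items, bits_per_item):
    """Equation (1): predicted span from whispered subvocalization rate."""
    wsr = 50 / time_for_50_items  # items whispered per second
    return 3.46 + 1.06 * wsr - 0.097 * bits_per_item


def span_from_silent(time_for_50_items, bits_per_item):
    """Equation (2): predicted span from silent subvocalization rate."""
    ssr = 50 / time_for_50_items  # items subvocalized silently per second
    return 2.95 + 1.15 * ssr - 0.11 * bits_per_item


# Three-syllable words (Table 1): observed corrected span 3.75 items.
print(f"{span_from_whispered(35.98, 12.40):.2f}")
print(f"{span_from_silent(26.35, 12.40):.2f}")

# Binary digits (Table 1): observed corrected span 6.63 items.
print(f"{span_from_whispered(17.31, 1.00):.2f}")
print(f"{span_from_silent(15.88, 1.00):.2f}")
```

For three-syllable words both equations give a predicted span within about 0.02 item of the observed value; for binary digits the predictions fall roughly 0.2 item short.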

We note, as an incidental and unexplained feature of the data (Table 2), the association 
of verbal IQ with white noise echoic storage and with age. 


Discussion 

The data of this experiment suggest that span size is a positive function of the rate at which 
the subject can subvocalize the stimulus items and an inverse function of their information 
content. It appears however to be essentially independent of age, IQ, serial learning ability 
and individual differences in echoic storage (at least as measured). 

No comprehensive theory at present appears able to predict precise span sizes under 
different conditions. The size of the span is not a constant number of items across subjects 
or materials. Nor does the span represent a simple constant-information process (Pollack, 
1953); raising the information content per item by a factor of 12 here decreased the span by 
only 44 per cent (Table 1). However, Crossman (1961) has suggested that the span size for 
different materials reflects the transmission of an approximately constant amount of 
information if both order and item information are considered, i.e. 


$$m \log_2 n + \log_2 m! = \text{constant}$$


where m is the mean span for a given material and n is the stimulus set size for that 
material. Using 10 materials, Crossman found a mean transmission rate of 28.72 bits per 
trial, with a standard deviation (between materials) of 6.32 bits, or 22 per cent. Despite the 
conceptual problems of Crossman’s formulation (Welford, 1968), no obviously superior 
predictor of span size is immediately apparent. 
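
Crossman's quantity can be evaluated for a non-integer mean span if the factorial is extended by the gamma function. That extension is an assumption of the following Python sketch, since the text does not state how m! was computed for fractional m; the function name is illustrative.

```python
import math


def crossman_bits(m, n):
    """Item plus order information (bits per trial) for span m and set size n.

    m! is taken as gamma(m + 1) so that fractional mean spans can be used
    (an assumption; the source does not say how non-integer m was handled).
    """
    item_information = m * math.log2(n)
    order_information = math.lgamma(m + 1) / math.log(2)  # log2(m!)
    return item_information + order_information


# Two-syllable words: corrected span 4.34, estimated set size 8020.
print(f"{crossman_bits(4.34, 8020):.2f}")
# Letters A-Z: corrected span 5.78, set size 26.
print(f"{crossman_bits(5.78, 26):.2f}")
```

With this convention the two examples come within a few hundredths of a bit of the corresponding HC entries in Table 5, although some other materials differ by a larger margin, so the original computation may not have followed exactly this route.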

The present data, if analysed according to Crossman’s formula (Table 5), indicate a 
mean transmission rate of 34.43 bits per trial, with an SD of 14.67 bits between materials, 
or 42 per cent. By contrast, if we convert the same data to yield the time that would be 
needed to subvocalize the number of items comprising the span with a given material, for 
silent and for whispered estimates, we obtain the approximately constant values shown in 
Table 5 (TS-silent and TS-whispered). These times show standard deviations of 0.131 s 
and 0.197 s for silent and for whispered estimates, representing 6.7 per cent and 8.8 per 
cent respectively of their grand means. These values are much closer to constancy than are 
Crossman's, thus permitting a more accurate prediction of the span. They also broadly 
resemble the finding of Baddeley et al. (1975), that the span for words equalled the number 
of items that could be read aloud in approximately 1.8 s (Expt VI). 
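
The conversion to subvocalization time per span can be sketched in Python from the Table 1 means. The sketch assumes that the time for a span's worth of items is the span multiplied by the mean time per item (the time for 50 items divided by 50), and that the between-materials SD is the sample standard deviation; the names are illustrative.

```python
import statistics

# (corrected span, seconds to subvocalize 50 items silently), from Table 1.
TABLE_1_SILENT = [
    (6.63, 15.88), (6.10, 15.96), (6.33, 16.54), (5.78, 17.05),
    (5.71, 17.15), (4.71, 18.44), (4.34, 20.46), (3.75, 26.35),
]


def time_to_subvocalize_span(span, time_for_50_items):
    """Seconds needed to subvocalize as many items as the span contains."""
    return span * time_for_50_items / 50


ts_silent = [time_to_subvocalize_span(s, t) for s, t in TABLE_1_SILENT]
mean_ts = statistics.mean(ts_silent)
sd_ts = statistics.stdev(ts_silent)  # sample SD across the eight materials

print(f"mean = {mean_ts:.2f} s, SD = {sd_ts:.3f} s, "
      f"SD/mean = {100 * sd_ts / mean_ts:.1f} per cent")
```

Computed from the rounded table entries, the mean is about 1.95 s and the SD about 0.13 s, close to the TS-silent values reported; small discrepancies in the last digit are to be expected from rounding.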


Table 5. Experiment 1: Information transmitted (HC) from different stimulus materials, 
according to Crossman's (1961) formula; also time (in s) needed to subvocalize the number 
of items comprising the span for silent and for whispered estimates (TS-silent and 
TS-whispered) 

Material                    HC    TS-silent  TS-whispered
Binary digits (0, 1)      17.83     2.11        2.29
Decimal digits (0-9)      29.65     1.95        2.23
Decimal digits (3-7)      24.48     2.09        2.18
Letters (A-Z)             36.06     1.97        2.18
Letters (J-Q)             25.73     1.96        2.13
Words (1 syllable)        29.36     1.74        2.05
Words (2 syllables)       61.64     1.78        2.19
Words (3 syllables)       50.70     1.98        2.70
Mean                      34.43     1.95        2.24
SD                        14.67     0.131       0.197








However, while the above constant-time principle provides a convenient and 
straightforward rule, the previous multiple regression analyses indicated that the optimal 
prediction of span size involves the use of an equation where subvocalization rate accounts 
directly for only part of the total span, due to the presence of a constant. The information 
content of the material must also be considered in using this approach, although its 
influence is modest and its regression coefficient only about one-tenth as great as that for 
subvocalization rate. Altogether, subvocalization time, or its inverse (rate), emerges as the 
most significant variable in this study. 

Two further experiments were performed to examine in more detail the effects of 
subvocalization rate on the span. 


Experiment 2 


This experiment attempted to replicate the previous results with a more homogeneous group 
of subjects. A secondary aim, which was not realized, was to produce changes in 
subvocalization rate by ingestion of alcohol, and to note any corresponding alterations in 
memory span. 

Method 
Subjects. Twenty experimentally naive volunteers were employed as subjects. All were university 
students, with a mean age of 20.5 yr (SD 0.93), and had previous drinking experience. Six were 
females. 


Materials. The same test stimuli were employed as in Expt 1, to measure spans and subvocalization 
times for each of eight materials. 


Procedure. Each subject was tested in four sessions of about 1.5 h each. Span size was measured in 
two sessions (one of them preceded by alcohol ingestion) and subvocalization times in the remaining 
two (one preceded by alcohol); these sessions were given in random order. Four measures of span 
were made in a session for each type of material, or four measures of subvocalization time (two silent, 
two whispered, each for 50 items) repeating the testing procedures of Expt 1 in all details. Under 
alcohol conditions, the subject drank absolute alcohol (1.1 ml per kg body weight), mixed with 
100 ml of orange juice, 30 min before testing commenced. A placebo was given before the other 
(normal) sessions, consisting of 200 ml of orange juice on which was floated a teaspoonful of whisky. 
The subjects had fasted for 4 h before each session, and expected to consume alcohol on each 
occasion. 


Results 


The mean corrected span and subvocalization times were calculated for each subject as 
before. No significant changes were found between the alcohol and normal conditions in 
silent subvocalization time, whispered subvocalization time, or the memory span (all 
P > 0.15, by t tests); the two estimates of the span differed by less than 3 per cent. All 
scores were therefore pooled across alcohol and normal sessions. (Subsequent analyses were 
checked for consistency between these two subsets of data.) 

The correlations across subjects between mean span size and whispered or silent 
subvocalization times were -0.48 and -0.41 respectively. These values, while again
statistically significant (P < 0.05), are smaller than those of Expt 1, perhaps because the
inter-subject variance in subvocalization times was considerably lower here (11.6 versus
21.7 for silent estimates; 14.4 versus 22.8 for whispered).


Table 6. Experiment 2: Mean corrected span across subjects for eight materials, plus mean
times (in s) to subvocalize 50 items (silent and whispered conditions); also estimates of the
time (in s) needed to subvocalize the number of items comprising the span (TS-silent and
TS-whispered)

                                      Subvocalization time
Material                Mean span    Silent    Whispered    TS-silent    TS-whispered
Binary digits (0, 1)      5.66       14.98      16.77         1.70          1.90
Decimal digits (0-9)      5.30       14.92      16.24         1.58          1.72
Decimal digits (3-7)      5.45       14.94      16.71         1.63          1.82
Letters (A-Z)             5.03       15.47      18.19         1.56          1.83
Letters (J-Q)             4.92       15.47      17.77         1.52          1.75
Words (1 syllable)        3.93       18.03      21.10         1.42          1.66
Words (2 syllables)       3.78       19.68      25.19         1.49          1.90
Words (3 syllables)       3.16       23.15      34.94         1.46          2.21
Mean                      4.654      17.08      20.86         1.545         1.849
SD                        0.910       3.01       6.43         0.092         0.169


The mean span showed a correlation, across materials, of -0.91 with the whispered
subvocalization time, and -0.95 with the silent time (both P < 0.01), as shown in Table 6.
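
The across-materials correlations just quoted can be checked directly from the material means in Table 6. The following is a minimal sketch rather than the authors' own analysis: it assumes an ordinary Pearson product-moment correlation over the eight material means, and the function name pearson_r is introduced here only for illustration.

```python
from math import sqrt

# Material means transcribed from Table 6 (Experiment 2), in table order.
span      = [5.66, 5.30, 5.45, 5.03, 4.92, 3.93, 3.78, 3.16]
silent    = [14.98, 14.92, 14.94, 15.47, 15.47, 18.03, 19.68, 23.15]
whispered = [16.77, 16.24, 16.71, 18.19, 17.77, 21.10, 25.19, 34.94]

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r_silent = pearson_r(span, silent)
r_whispered = pearson_r(span, whispered)
print(f"span vs silent time:    r = {r_silent:.2f}")
print(f"span vs whispered time: r = {r_whispered:.2f}")
```

Both coefficients come out negative because longer subvocalization times (slower rates) go with shorter spans.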

Multiple regression analyses of these data indicated that the best description of the span,
accounting for 97.5 per cent of the variance, was given by the expression


Span = 3.14 + 0.817 WSR - 0.088 H items    (3)


where WSR is the whispered subvocalization rate for a given material (items whispered
per s), and H is the information content for that material (bits per item). Alternatively, the
equation


Span = 1.40 + 1.24 SSR - 0.070 H items    (4)


where SSR is the silent subvocalization rate for a given material, accounted for 97.4 per
cent of the variance.
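
As an illustration, equations (3) and (4) can be evaluated against the digit and letter rows of Table 6. This is a sketch under stated assumptions, not the original analysis: the information content H is taken here as log2 of the set size, which assumes equiprobable items, and the rates are taken as 50 items divided by the tabled 50-item times. The word materials are left out because their information content is not given in this section. The function names are hypothetical.

```python
from math import log2

def span_from_whispered(wsr, h):
    """Equation (3): span from whispered rate (items/s) and bits per item."""
    return 3.14 + 0.817 * wsr - 0.088 * h

def span_from_silent(ssr, h):
    """Equation (4): span from silent rate (items/s) and bits per item."""
    return 1.40 + 1.24 * ssr - 0.070 * h

# (material, set size, observed span, silent s per 50 items, whispered s per 50 items)
# Spans and times are from Table 6; set sizes follow from the material labels.
materials = [
    ("Binary digits (0, 1)", 2, 5.66, 14.98, 16.77),
    ("Decimal digits (0-9)", 10, 5.30, 14.92, 16.24),
    ("Decimal digits (3-7)", 5, 5.45, 14.94, 16.71),
    ("Letters (A-Z)", 26, 5.03, 15.47, 18.19),
    ("Letters (J-Q)", 8, 4.92, 15.47, 17.77),
]

predictions = []
for name, set_size, observed, t_silent, t_whispered in materials:
    h = log2(set_size)          # ASSUMPTION: equiprobable items in each set
    wsr = 50 / t_whispered      # whispered rate, items per second
    ssr = 50 / t_silent         # silent rate, items per second
    pred3 = span_from_whispered(wsr, h)
    pred4 = span_from_silent(ssr, h)
    predictions.append((name, observed, pred3, pred4))
    print(f"{name:22s} observed {observed:.2f}  eq.(3) {pred3:.2f}  eq.(4) {pred4:.2f}")
```

Under these assumptions both equations land within a few tenths of an item of the observed spans for these five materials.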

Also shown in Table 6 are the times required to subvocalize silently or aloud the number 
of items comprising the span (TS-silent and TS-whispered), estimated from the 50-item 
subvocalization times. The values given for TS-silent vary by only 5.9 per cent around their
mean value (9.1 per cent for TS-whispered), and the absolute values of both variables
resemble the corresponding times found by Baddeley et al. (1975). However, these values for
TS-silent and TS-whispered are noticeably lower than the times obtained under supposedly
identical conditions in Expt 1 (t = 7.1 and 4.3 respectively, d.f. = 14, both P < 0.01),
probably because of differences in the subject population. Otherwise, the data appear 
generally to replicate the results of the first experiment. The failure to obtain any effect 
from alcohol is not entirely unexpected, even though each subject consumed almost the 
equivalent of a bottle of wine, since alcohol-induced memory impairment is typically 
minimized as the retention period is shortened (Jones & Jones, 1977). Weingartner & 
Murphy (1977) likewise failed to find any effect of alcohol on immediate recall, although 
recall after 20 min of retention was sharply reduced. 
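
The TS estimates and their variability can be reproduced from the Table 6 means. The sketch below assumes that TS is the span multiplied by the time per item (the 50-item time divided by 50), and that the quoted percentage variation is the standard deviation across the eight materials expressed as a percentage of their mean. Both readings are inferences from the text, and the helper names are hypothetical.

```python
from statistics import mean, stdev

# (material, mean span, silent s per 50 items, whispered s per 50 items), from Table 6.
TABLE_6 = [
    ("Binary digits (0, 1)", 5.66, 14.98, 16.77),
    ("Decimal digits (0-9)", 5.30, 14.92, 16.24),
    ("Decimal digits (3-7)", 5.45, 14.94, 16.71),
    ("Letters (A-Z)", 5.03, 15.47, 18.19),
    ("Letters (J-Q)", 4.92, 15.47, 17.77),
    ("Words (1 syllable)", 3.93, 18.03, 21.10),
    ("Words (2 syllables)", 3.78, 19.68, 25.19),
    ("Words (3 syllables)", 3.16, 23.15, 34.94),
]

def time_for_span(span, seconds_per_50_items):
    """Estimated time (s) to subvocalize as many items as the span holds."""
    return span * seconds_per_50_items / 50

def percent_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * stdev(values) / mean(values)

ts_silent = [time_for_span(span, sil) for _, span, sil, _ in TABLE_6]
ts_whispered = [time_for_span(span, whi) for _, span, _, whi in TABLE_6]

cv_silent = percent_variation(ts_silent)
cv_whispered = percent_variation(ts_whispered)
print(f"TS-silent:    mean {mean(ts_silent):.3f} s, variation {cv_silent:.1f} per cent")
print(f"TS-whispered: mean {mean(ts_whispered):.3f} s, variation {cv_whispered:.1f} per cent")
```

Although the span itself ranges from about 3.2 to 5.7 items across materials, the time needed to subvocalize a span's worth of items stays within a narrow band.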


Experiment 3 


To test further the present findings, it was decided to examine the memory spans of 
bilingual subjects. Experimental evidence and everyday observation indicate that 
individuals who speak a second language with less than perfect fluency often show a 
reduced span in that language (Kassum, 1967; Pimsleur, 1971) even when tested with items 
that are easily recognized and articulated, such as digits. A test given in a subject's second 
language will be referred to here as involving a ‘heterophone’ task, in contrast to the 
normal or ‘homophone’ situation where he is tested in his mother tongue. It was predicted 
from the previous results that the span size in English and in French for bilinguals would
be a direct function of their individual subvocalization rates in these languages. However, 
in order to minimize any effects on span resulting from differential familiarity or 
meaningfulness of items in the two languages (Watkins, 1977), all the test items were 
nonsense syllables that were spoken with either an English or a French pronunciation. 


Method 


Subjects. Fourteen university students were employed; all were functionally bilingual in English and 
in French. The mother tongue of seven subjects was French, the remainder being from English 
backgrounds. The group comprised eight males and six females. 


Materials. The test items were all drawn from a common pool of 1800 CVC nonsense syllables 
ranging in association value from 0-100 per cent (Hilgard, 1951, Table 8). No item was used if it 
resembled a word in English or in French. Sequences of these items were tape-recorded in both
languages, using the same totally bilingual female speaker in either case, at a rate of 1 item/s. The
sequences ranged from 2 to 5 items, with a momentary tone at the end of each to initiate recall
during the 15 s before the next sequence commenced. 


Procedure. Ten determinations of the span were made for each subject in English, and 10 in French. 
Each involved the presentation on successive trials of 2, 3, 4 and 5 nonsense syllables (in alternate 
ascending and descending order). The subject gave immediate verbal recall of each test sequence, and
followed the pronunciation used in the recording. Eight measures were then taken of the time to
subvocalize 50 items typed in capitals: 2 under each combination of English versus French 
pronunciation, and silent versus whispered conditions, in random sequence. The experimenter noted 
these times with a stopwatch. 


Results 

The silent and whispered subvocalization times and the mean span are shown in Table 7, 
for subjects with English backgrounds and for those with French as their mother tongue. 
As Table 7 indicates, the spans here are quite low, even for homophone conditions, and fall 
below those previously obtained for three-syllable words. An analysis of variance for span 
size indicated that while no significant effect overall was produced by the test language 
(English versus French) or mother-tongue language (both F < 1), a significant interaction 
occurred (F = 7.22, d.f. = 1, 24, P < 0.01), such that spans were longer when measured in
the mother tongue. The mean value for all these 14 homophone spans was 3.33 items,
significantly exceeding the mean for heterophone conditions of 3.04 (t = 2.74, d.f. = 13,
P < 0.01).


Table 7. Experiment 3: Mean spans under homophone and heterophone conditions, with
corresponding mean subvocalization times for 50 items (silent and whispered), and
estimated times needed to subvocalize the number of items comprising the span (TS-silent
and TS-whispered)

                 Subjects'                         Subvocalization time
Form of test     mother tongue      Mean span     Silent     Whispered    TS-silent    TS-whispered
English          English (n = 7)     3.36         25.43      33.71         1.71         2.25
                                    (0.28)        (4.74)     (5.47)       (0.34)       (0.34)
                 French (n = 7)      2.97         26.71      41.79         1.59         2.50
                                    (0.29)        (5.59)    (16.60)       (0.39)       (1.08)
French           English             3.11         30.29      38.71         1.90         2.41
                                    (0.26)        (7.66)     (8.90)       (0.52)       (0.58)
                 French              3.30         24.00      38.36         1.59         2.46
                                    (0.29)        (3.32)    (11.95)       (0.30)       (0.84)
Homophone mean                       3.33         24.72      36.04         1.65         2.36
Heterophone mean                     3.04         28.50      40.25         1.74         2.46

Note. Standard deviations are given in parentheses.


The obtained silent and whispered subvocalization times for 50 items were then classified 
as homophone versus heterophone values (pooling English and French subjects) and 
subjected to an analysis of variance. This indicated that the homophone times (anglophone 
subjects subvocalizing in English, plus francophones subvocalizing in French) were 
significantly lower than the heterophone values (anglophones subvocalizing in French plus 
francophones in English), with F = 4.02, d.f. = 1, 52, P = 0.05. Also, silent times were
significantly lower than whispered times (F = 38.8, d.f. = 1, 52, P < 0.01); there was no
interaction between the two main effects (F < 1). As may be seen in Table 7, the whispered 
and silent subvocalization times are decreased respectively by 10 per cent and 13 per cent 
under homophone conditions (while the span is increased by 10 per cent). 

The time needed for each subject to subvocalize the number of items comprising his span 
was then calculated, for homophone and heterophone tasks, using both silent and 
whispered measures; the relevant means are shown in Table 7. An analysis of variance 
showed that there was no significant difference in these values between homophone and 
heterophone conditions (F = 0.4, d.f. = 1, 52, P > 0.25). The mean values obtained for
TS-whispered under homophone and heterophone conditions differed by only 4 per cent; 
the two corresponding estimates for TS-silent diverged by 5 per cent. Thus, the shortening 
of the span observed under heterophone conditions is accompanied by a proportionate 
decrease in the rate of subvocalization, whether silent or whispered. This is in accord with 
expectations on the basis of the two previous experiments. The span’s low absolute 
magnitude may also plausibly be attributed to the generally elevated subvocalization times 
found here (Tables 6 and 7). Thus, the value of TS-silent, for example, remains quite 
similar in Expts 2 and 3 (at 1.55 s and 1.65 s respectively).
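
The percentage differences between homophone and heterophone conditions quoted above follow from the two summary rows of Table 7. The sketch below assumes that each difference is expressed relative to the heterophone value; the text does not state the baseline, so this is an assumption, and the names used are hypothetical.

```python
# Summary rows of Table 7 (Experiment 3).
homophone = {"span": 3.33, "silent": 24.72, "whispered": 36.04,
             "ts_silent": 1.65, "ts_whispered": 2.36}
heterophone = {"span": 3.04, "silent": 28.50, "whispered": 40.25,
               "ts_silent": 1.74, "ts_whispered": 2.46}

def percent_change(value, baseline):
    """Change of value relative to baseline, in per cent (negative = decrease)."""
    return 100 * (value - baseline) / baseline

# ASSUMPTION: heterophone values serve as the baseline for every measure.
change = {key: percent_change(homophone[key], heterophone[key]) for key in homophone}

for key, pct in change.items():
    print(f"{key:13s} {pct:+.1f} per cent under homophone conditions")
```

The span and the two subvocalization times shift by roughly 10 to 13 per cent, whereas the derived TS values shift by only about 4 to 5 per cent, which is the pattern the argument relies on.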


General discussion and conclusions 


At the empirical level, the present data consistently indicate that the size of the memory 
span varies mainly with the rate at which the subject can subvocalize the test material. 
While the data are strictly correlational in nature, they appear sufficiently clear to suggest
that these variables are probably also linked in a causal relationship, but in any event the 
span can be predicted with some success for different individuals and test materials by 
employing this principle. 

It also appears that other factors such as long-term learning ability, age and intelligence 
are unimportant in determining the span, though only a fairly restricted range of values is
employed here. This must be qualified by noting, for example, that Craik (1971) has found
a correlation of 0.72 between the span and estimates of secondary memory capacity. It is
also true that the span traditionally correlates with age and IQ; but partial correlations, 
holding subvocalization rates constant, suggest this is based on the confounding of 
variables; the same proviso might conceivably apply to Craik's results. 

The data of Expts 2 and 3 generally support the above principle, though both studies are 
restricted by the relative constancy encountered in the span. Alcohol did not change 
subvocalization rates, or the span, whereas testing in a language other than the mother 
tongue slightly but significantly decreased both these variables, in a proportionate manner. 

The simplest empirical principle found here can be expressed by noting that the time
needed to subvocalize the number of items comprising the span is approximately constant
for different stimulus materials (though not necessarily between different experiments); this
result agrees with the formulation of Baddeley et al. (1975). Alternatively, a closer
description may be given in terms of a regression equation involving both the 
subvocalization rate and the information content of the stimuli, which accounted for 98 per 
cent of the observed variance in the span. 

In theoretical terms, the regression equations (1-4) suggest a span which comprises a
fixed number of items (about 3, or less) plus a variable number of items that is
proportional to subvocalization rate, minus a small factor based on information content
per item. One may tentatively postulate that the constant items represent primary memory
capacity, which is characteristically about this size (Martin, 1978), and that the remaining
items of the span are at least briefly held in a verbal loop, so that faster cycling of stimulus
items through this loop permits more of them to be maintained. The slight decrease of span
with information content per item, even when subvocalization rate is held constant, may
plausibly be attributed, at least in part, to the subject’s decreased probability of guessing
correctly the last items of a test sequence with high-information materials (Woodhead,
1966). In the present case, on the simplest assumptions, the change from 13-bit materials
(with negligible guessing probability) to 1-bit materials might thus add 0.5 + 0.5^2 + 0.5^3 ...
items in the latter case, due to guessing (about 0.9 item). The regression equations (1-4)
yield a mean coefficient for information content of 0.091; this would add 12 × 0.091, or 1.1
items to the span for 1-bit materials.
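
The two figures compared in this paragraph can be worked through explicitly. The sketch below is an illustration only. It assumes that a 1-bit item is guessed correctly with probability 0.5, and that the "about 0.9 item" figure corresponds to the first few terms of the geometric series; the text does not say where the series is truncated, and the full series converges to 1.

```python
# Guessing account: each successive trailing item of a 1-bit sequence is
# guessed correctly with probability 0.5, so k correct guesses in a row
# have probability 0.5 ** k.
p_guess = 0.5

three_term_sum = sum(p_guess ** k for k in range(1, 4))  # 0.5 + 0.25 + 0.125
series_limit = p_guess / (1 - p_guess)                   # sum of the infinite series

# Regression account: mean coefficient of 0.091 items per bit, applied to the
# 12-bit difference between 13-bit and 1-bit materials.
regression_gain = (13 - 1) * 0.091

print(f"guessing, first three terms: {three_term_sum:.3f} items")
print(f"guessing, infinite series:   {series_limit:.3f} items")
print(f"regression estimate:         {regression_gain:.3f} items")
```

On either reading the guessing account gives a gain of roughly one item, the same order as the regression estimate.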


Since the span is closely correlated with the rate at which the test material can be 
scanned in memory (Cavanagh, 1972), and silent estimates of subvocalization were
generally not inferior to whispered values as memory predictors here, it is probably the
speed of central control processes, rather than the motor component of subvocalization, 
that limits information processing in the span task. However, the span size is also positively 
correlated with the rate at which various materials may be read aloud (Mackworth, 1963); 
the span thus provides a noteworthy quantitative link between the diverse activities of 
memory scanning, covert rehearsal and voiced reading. 

The present emphasis upon subvocalization appears broadly compatible with many 
known properties of short-term memory, such as the disruption of performance by 
irrelevant vocalization (Peterson & Peterson, 1959), the effects of irrelevant or relevant 
articulation during storage (Murray, 1967; Richardson & Baddeley, 1975), the 
characteristics of echoic memory (Tell, 1971) and the effects of word frequency (Watkins, 
1977). Clearly, however, subjects may fall back on other coding methods when required 
(O'Connor & Hermelin, 1973; Yik, 1978); it is also difficult to account on this basis for 
phenomena such as suffix effects (Morton et al., 1971). 

Further studies would be desirable to examine the precise effects upon subvocalization 
rates of stimulus variables relevant to short-term memory, such as word length, acoustic 
confusibility and familiarity. Echoic storage under span-like conditions may also merit 
additional study, despite the present negative results, in the light of Parkinson's (1974) 
finding of a strong correlation between span size and echoic storage as inferred from
dichotic memory performance.


Interaction of auditory and visual information in speech perception
Barbara Dodd 





Two experiments investigated the role of stored auditory and visual information for unimodal speech 
perception tasks. The first experiment showed that hearing subjects performed better than deaf 
subjects on a lip-reading task, possibly because they could supplement lip-read stimuli with stored 
information derived from the auditory modality. The second experiment demonstrated that sighted 
subjects did not use stored visual information to supplement an auditory input when detecting
mispronunciations, since their performance did not differ from that of congenitally blind subjects. 
The processing of visual (lip-read) information in speech perception is discussed. 





The ability of normally hearing subjects to make extensive use of visual (lip-read) 
information for speech perception has now been established. Experimental studies of 
lip-reading fall into three categories. One group compared the lip-reading abilities of deaf 
and hearing subjects using gross quantitative measures (Clouser, 1977; Conrad, 1977) and 
in both studies the hearing and deaf children performed equally. The second group of 
studies assessed hearing subjects' ability to discriminate visually presented CV syllables, in 
order to set up visually contrasting categories of sounds (Woodward & Barber, 1960; 
Binnie et al. 1974; Erber, 1974). 

The third group investigated the relationship between auditory and visual information in 
speech perception. One experiment, which tested the lip-reading abilities of hearing
adolescents (Dodd, 1977) using real words, showed that subjects relied on visual
information when white noise made speech difficult to hear. Desynchronizing the two 
inputs (hearing and vision) by 400 ms resulted in a significant increase in perceptual errors. 
However, despite the asynchrony of the two inputs in this condition, having both sources 
available provided significantly more information than either vision alone, or masked
hearing alone. Subjects were able to combine information extracted from the asynchronous 
inputs. When, however, vision and hearing were placed in competition in a dubbed 
condition, the contradictory nature of the stimuli resulted in an increase in errors, showing 
that subjects were unable to ignore either of the inputs. MacDonald & McGurk (1978) 
found that when auditory and visual information conflicted, subjects reported neither the 
sound they heard, nor the sound they saw, but had the illusion of perceiving another 
sound, e.g. when hearing [ba] dubbed on to the lip movements for [ga], subjects perceived
[da]. In general, their results were consistent with the suggestion made by Miller & Nicely 
(1955) that vision provides cues about place of articulation (e.g. bilabial, dental, velar) 
whereas hearing provides information about manner of articulation (e.g. plosion, nasality) 
and voicing. 

However, the use of masked, asynchronous and dubbed stimuli does not reflect the usual 
environment in which speech perception occurs. In another experiment (Dodd, 1979) 
children were asked to detect mispronounced words, hidden in a list of correctly 
pronounced ones, when they were allowed only to listen, and when they were able to 
observe lip movements as well as listen. The words were mispronounced by changing either 
a feature of place (e.g. kerson for person) or a feature of manner or voicing (e.g. mobent 
for moment). The results showed that children detected more mispronunciations when they 
were able to observe lip movements as well as listen, and that this superior performance 
was due to an increased ability to detect words mispronounced by a change in a feature of 
place of articulation. Thus even in quiet, face-to-face communication, visual (lip-read) 
information plays a role in speech perception.

This conclusion poses a problem for theories of speech perception, since the process has
usually been considered to be specific to the auditory modality (Darwin, 1976). However, it 
is not a new problem. The ability to recognize one stimulus as being equivalent though it is 
perceived through different sensory modalities has been studied extensively, most often in 
terms of vision and touch (see O'Connor & Hermelin, 1978, for review). 

The use of representational codes allows the recoding of information perceived by one 
modality into a code derived from another modality. Conrad (1964) found that written 
words were remembered in terms of an acoustic-articulatory code; and Paivio (1969) 
showed that auditorially perceived words that had a high imagery content were stored both 
verbally and in a visual image code. O’Connor & Hermelin (1978) report several 
experiments demonstrating that normal children were able to use internal representations
derived from previous experience in one modality to interpret stimuli presented to another. 
Blindfolded normals used a visual image code when drawing previously traced tracks 
backwards, judging the spatial orientation of mirror image forms, and estimating the extent 
of arm movements; whereas congenitally blind subjects had to rely on tactile and 
kinaesthetic information, and thus performed poorly by comparison. 

If speech perception cannot be considered specific to the auditory modality, it is 
necessary to investigate ways in which lip-read information may be integrated with 
auditory information. One possible hypothesis is that lip-read information is recoded into a 
code derived from auditory perception. The experiments presented in this paper investigate 
the role of stored visual and auditory information for unimodal speech perception tasks. 


Experiment 1 


This experiment was designed to answer the question: Do normally hearing subjects extract 
the same information from a lip-read input as prelingually profoundly deaf subjects? Those 
studies that have compared these two subject groups’ lip-reading abilities in quantitative 
terms have found no significant difference (Clouser, 1977; Conrad, 1977). However, since 
hearing children may have stored information available about the congruence of speech 
sounds and lip movements, it is possible that they can use this information to supplement 
purely visual (lip-read) inputs, and therefore show a different pattern of errors. 


Method 


(1) Subjects. Twelve prelingually profoundly deaf subjects, who attended a secondary school for the 
deaf in London, acted as subjects. Their mean chronological age (CA) was 14 years 5 months 
(ranging from 14 years 0 months to 15 years 7 months). Subjects had a mean pure hearing loss of 
102 dB (range 81-117 dB) in the better ear over the speech frequency range. Deafness had occurred in
all cases before 12 months of age, and known causes of deafness included maternal rubella, 
meningitis, thalidomide poisoning and familial deafness. There were eight male and four female 
subjects. The school encouraged children to lip-read and use spoken language. 

The deaf subjects were matched individually on a spelling test, and a word recognition test, with 12 
children who had normal hearing, from a junior school in London. The hearing subjects had a mean 
CA of 10 years 7 months (range: 9 years 8 months to 11 years 3 months). There were six male and 
six female subjects. 


(2) Procedure: Subject matching tasks

Pretest 1: Spelling test. Subjects were first asked to perform a sentence completion task, e.g. The 
fire...is red. There were 20 orthographically presented sentences to complete, designed to elicit 20 
selected words from the Schonell Spelling Test (1932). If the children did not know which word they 
should write, they were given semantic clues. On the rare occasions where a child could not guess the 
words, he was told ‘Write——’. The words had to be spelt perfectly to be scored as correct, e.g. build 
for built was counted as an error. 

Since the two groups were matched on this task the number of words spelt correctly did not differ. 
The mean number of words spelt correctly for the hearing group was 8.6 and for the deaf 9.5 out of a
total of 20 (Student's t = 0.46, d.f. = 11, n.s.).

Pretest 2: Word recognition test. Each of the 20 words used in the spelling test was listed with three
incorrect spellings of the word: 
e.g. engin 
engine 
endgin 
engun 
The position of the correctly spelled word was randomized. Subjects were asked to point to the 
correctly spelled word. 
The two groups were also matched on this task, and therefore the number of words recognized 
correctly did not differ. The mean number of words recognized correctly for the hearing group was 
16.6, and for the deaf group 18.2 (Student's t = 1.91, d.f. = 11, n.s.).


(3) The experimental task. The instructions were: ‘I am going to say some words. They are not real 
words. They are nonsense words. Nonsense words are made up words, like preel. I want you to write 
the words down.’ The instructions were presented in written form. Practice trials were then given to
familiarize subjects with the task, and to ascertain that they understood what was required. Subjects
sat opposite the examiner across a 3 ft wide table. Testing was carried out in a quiet, naturally well-lit
room in the school building.

The deaf subjects had their hearing aids switched off before the experiment began. Normal subjects 
wore earphones especially designed for muffling sound, and only the lip movements were presented. 


(4) The stimuli. The stimuli consisted of 24 nonsense words that conformed to the rules of English 
phonology, 12 of which were of CCVC form, and 12 of CVCC form (see Appendix I). Twelve of the 
stimuli (6 CCVC and 6 CVCC) were designed to be easy to lip-read (according to data gained from
hearing children; Dodd, 1977) in that they consisted of at least one ‘front’ consonant (p, b, m, f, v,
θ, ð) and two other consonants which were either ‘front’ consonants or ‘middle’ consonants (t, d, s,
z, l, n, ʃ, tʃ, dʒ). Twelve of the words were designed to be difficult to lip-read in that they consisted
of at least one ‘back’ consonant (k, g, ŋ, h, j), plus two other consonants that were either ‘middle’ or
‘back’ consonants. The hard and easy words were combined into one randomly ordered list.


(5) Scoring. Since there were three consonants per word, each word was given a score out of 3, which 
indicated the number of consonants correctly represented, e.g. lisk for yisk would be scored as 2. 
Vowels were not scored since phoneme-grapheme relationships for vowels are variable. Either
member of the voiced/voiceless pairs (p/b, t/d, f/v, s/z, k/g) was scored as correct, since lip-read inputs
give no information about voicing, e.g. fump for vomp would score 3. Where subjects interpolated an 
extra sound, e.g. skelp for skel, a point was subtracted. Thus skelp scored 3 for the inclusion of the 
letters s k l, but a point was subtracted for the p, leaving 2. Similarly, a point was subtracted for an 
extra ‘consonant’ syllable, e.g. leshain for klush scored 1. 
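
The scoring rules above can be sketched as a small function. This is an illustrative reading, not the scoring procedure actually used. It assumes that the consonants of target and response have already been segmented by hand (so that ‘sh’ counts as one consonant), that matching ignores consonant order, that a substituted consonant simply earns no point, and that an extra syllable is flagged by the scorer rather than detected automatically. All names are hypothetical.

```python
# Voiced/voiceless pairs treated as equivalent, per the scoring paragraph.
VOICING_CLASS = {"p": "p", "b": "p", "t": "t", "d": "t", "f": "f", "v": "f",
                 "s": "s", "z": "s", "k": "k", "g": "k"}

def canonical(consonant):
    """Map each member of a voiced/voiceless pair to a shared label."""
    return VOICING_CLASS.get(consonant, consonant)

def score_response(target, response, extra_syllables=0):
    """Score a written response against a three-consonant target.

    target, response: lists of hand-segmented consonants.
    extra_syllables: number of added syllables judged by the scorer (ASSUMPTION).
    """
    remaining = [canonical(c) for c in response]
    correct = 0
    for consonant in target:
        key = canonical(consonant)
        if key in remaining:          # ASSUMPTION: order-insensitive matching
            remaining.remove(key)
            correct += 1
    # One point off for each consonant beyond the number in the target.
    interpolated = max(0, len(response) - len(target))
    return correct - interpolated - extra_syllables

# The four worked examples given in the text.
examples = {
    "lisk for yisk": score_response(["y", "s", "k"], ["l", "s", "k"]),
    "fump for vomp": score_response(["v", "m", "p"], ["f", "m", "p"]),
    "skelp for skel": score_response(["s", "k", "l"], ["s", "k", "l", "p"]),
    "leshain for klush": score_response(["k", "l", "sh"], ["l", "sh", "n"],
                                        extra_syllables=1),
}
for label, score in examples.items():
    print(f"{label}: {score}")
```

Under these assumptions the function reproduces the four example scores given in the text (2, 3, 2 and 1).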


Table 1. Mean number of consonants written correctly for deaf and hearing subjects

              Hard words    Easy words
Deaf            11.25         18.20
Hearing         16.33         19.75


Results


The mean number of consonants written correctly for deaf and hearing groups, for words 
easy and hard to lip-read, are presented in Table 1. A two-way analysis of variance was
performed on the raw data, and revealed a significant group difference (F = 6.31, d.f. = 1,
22, P < 0.025), a significant conditions term (F = 46.49, d.f. = 1, 22, P < 0.001) and a
groups-by-conditions interaction (F = 5.25, d.f. = 1, 22, P < 0.05). The interaction was
tested using the least significant difference method (Winer, 1962). The results indicate that,
for both groups, those words easy to lip-read were less prone to error than those words
that were hard to lip-read (Deaf: t = 6.48, d.f. = 22, P < 0.001; Hearing: t = 3.19,
d.f. = 22, P < 0.01) and that when writing down words hard to lip-read, the hearing were
significantly better than the deaf (t = 3.32, d.f. = 44, P < 0.01). However, there was no
difference between the groups for easy words (t = 1.01, d.f. = 44, n.s.).


Discussion 


The results showed that both hearing and deaf subjects were better able to spell nonsense 
words that were designed to be easy to lip-read than words that were designed to be 
difficult to lip-read. There was a significant group difference, the hearing subjects 
performing better than the deaf subjects. However, testing of the groups-by-conditions 
interaction revealed that this group difference could be attributed to the normal group 
performing better on words that were hard to lip-read than did the deaf group. There was 
no significant group difference on words that were easy to lip-read. 

The finding that the hearing subjects performed better on hard to lip-read words than 
did the deaf subjects is not in agreement with previous studies comparing the lip-reading 
abilities of the two groups (Clouser, 1977; Conrad, 1977). The most likely explanation of 
this disagreement is the difference in the type of tasks used. The present experiment used 
single nonsense words, designed to include equal numbers of hard and easy to lip-read 
words. Conrad’s stimuli were not subdivided in this way, and since his task required 
responding by pointing to pictures described, differences between the groups might have 
been obscured. While Clouser used very gross measures of visibility, and found a small 
relationship between visibility and ease of lip-reading, he did not test for group differences 
on sentences with high (easy) and low (hard) visibility scores. 

The findings of the present experiment may be interpreted as evidence for hearing 
children being able to use stored auditory information to supplement their perception of 
lip-read speech. If they have observed from infancy that there is a congruence between lip 
movements and speech sounds (Mills, 1978; Dodd, in press a) then it is not surprising that 
their ability to detect differences between sounds that are hard to lip-read would be 
enhanced by that dual input. That is, the experience of perceiving acoustic contrasts that 
are congruent with small visual differences allowed the hearing subjects to develop visual 
discrimination of lip movements superior to that of the deaf subjects. Necessarily, the
superior visual discriminatory ability is bound to acoustic contrasts, and the visual 
information must be recoded into a form that includes phonological information derived 
from audition. If the hearing children were relying solely on visual information, without 
access to stored phonological information, their performance should not have been better 
than that of the deaf children. 

The result cannot be attributed to the hearing gaining auditory information during the 
experiment, or to superior spelling abilities, since the groups performed similarly on words 
that were easy to lip-read. While motivational factors are crucial in any experiment, the fact 
that hard and easy words were combined into one randomly ordered list makes it unlikely 
that the deaf were less motivated for perceiving hard words than easy words. 

Nor can the results be explained in terms of a neuro-articulatory model of speech 
perception (Liberman et al. 1967). Such a model would propose that since the visual input 
provides direct information about certain aspects of articulation, primarily features of place 
(e.g. bilabial, dental, velar), then the most likely code of processing would be derived from 
articulatory information. However, in the present experiment, if the subjects were using an 
articulatory code, the hearing subjects ought to have performed better than the deaf 
subjects on both easy and hard to lip-read words, since their articulatory abilities were 
obviously superior to those of the deaf subjects. Further evidence against the possible use 
of an articulatory processing code comes from an experiment showing that when deaf 
children were asked to lip-read, repeat and then write down nonsense words, the two 
outputs (speaking and spelling) differed in a consistent manner (Dodd, in press b).

Thus the results seem to be most consistent with the hypothesis that hearing children are
able to supplement their perception of lip-read speech with stored information derived 
from audition. Such information would also include intuitive knowledge of the constraints 
of English phonology. 


Experiment 2 


Experiment 1 suggested that normally hearing children could use stored information 
derived from audition to supplement their perception of lip-read speech. Experiment 2 was 
designed to answer the question: can children with normal vision use stored visual (lip-read) 
information to supplement speech perception when speech is presented only to the auditory 
modality? The task used was adapted from Cole (1973). 


Method 


Subjects. Ten totally congenitally blind children who attended a school for the blind in London acted
as subjects. Their mean CA was 12 years 8 months (range: 11 years 5 months to 15 years 2 months). 
There were seven male and three female subjects. There were 16 normally sighted subjects, who 
attended a junior school in London. Their mean CA was 9 years 10 months (range: 9 years 5 months 
to 10 years 5 months). Subjects were not carefully matched for CA since the stimulus material was 
appropriate for children from 8 years, and the response was simple. 


Procedure. Each child's hand was placed on the response button and he was told: ‘You are going to 
hear a story. It is a story about a little girl called Rebecca who is in another world, and the 
adventures that she has there. Sometimes the lady who is reading the story will make a mistake. For 
example, she might say “bindow” instead of “window”. When she makes a mistake I want you to 
press this button here as quickly as you can. You will have to listen very carefully to hear all the 
mistakes. So if you think you hear a mistake, press the button. Now we'll have some practice.’ 

A short passage was read, introducing the characters in the story. It contained five 
mispronunciations and was read ‘live’. Then the tape-recorded story, read by a female adult who had 
a London accent and did not know the purpose of the experiment, was played. The story was a 
chapter from Rebecca's World by Terry Nation (1975). The experimenter marked a score sheet which 
listed the mispronunciations for each subject by marking those words which were detected. 


The Stimuli. There were 40 mispronounced words in the story which consisted of approximately 1300
words. The 40 mispronounced words (Appendix II) had the following properties:

(1) The error was never part of a consonant cluster. 

(2) Twenty had initial errors, e.g. pouldn’t for couldn't, and 20 had medial errors, e.g. corder for 
corner. 

(3) Ten each of these 20 words had an error due to the substitution of a sound from the same place 
of articulation, but a different manner or voicing, e.g. bekan for began, and the other 10 had an 
error due to the substitution of a sound from a different place of articulation, but the same 
manner, e.g. kanic for panic. 

(4) The substitute pairs differed by only one distinctive feature (Keyser & Halle, 1968). 

(5) The level of markedness of sounds was controlled across manner and place substitutes 
(McReynolds et al., 1974). 

(6) The direction of the substitution (e.g. p-k, k-p) was balanced across initial and medial positions. 

(7) The word frequency was controlled across place and manner substitutions according to 
Thorndike & Lorge (1944). 


Results 


A groups (blind vs. sighted) × conditions (change in place vs. change of manner feature)
analysis of variance for unequal groups, using the unweighted means solution (Winer,
1962, p. 31) was carried out on the raw error scores (i.e. mispronunciations not detected).
The results showed no significant groups term (F = 0.16, d.f. = 1, 24, n.s.), i.e. the blind
and normal groups performed equally. The conditions term was significant (F = 17.99,
d.f. = 1, 24, P < 0.001) indicating that words mispronounced by a change in place of
articulation were detected significantly less often than words mispronounced by a change in
manner of articulation. There was no significant groups-by-conditions interaction term
(F = 0.18, d.f. = 1, 24, n.s.). The mean error scores are shown in Table 2.


Table 2. Mean number of errors for blind and normal children on error detection task

                                Normal    Blind
Change in manner (e.g. m-b)       9.7      9.2
Change in place (e.g. p-k)       11.7     11.3


Discussion


The finding that normally sighted and totally congenitally blind children performed 
identically on an error detection task presented only to the auditory modality can be 
interpreted as an indication that lip-read features were not used by the normally sighted 
subjects to supplement auditory speech information. It seems that in this task both groups 
of children relied more on features of manner of articulation or voicing in order to detect 
mispronunciations. This finding fits with the results of experiments (Dodd, 1977; 
MacDonald & McGurk, 1978) demonstrating that audition gives more precise information 
about manner of articulation and voicing, and vision provides cues about place of 
articulation. 

The general result concurs with the conclusions drawn by O'Connor & Hermelin (1978). 
They suggest that if stimuli are presented to the modality most appropriate for their 
processing (in this instance audition), there is no need for the information to be recoded, 
and supplemented by stored information from other modalities (in this case vision). 

While stored visual (lip-read) information did not seem to be used in the task described in 
Expt 2, possibly because error detection tasks provide little data about the modality source 
of information, the results of other studies have shown that visual information can be 
stored for use in different types of tasks. Experiments that have tested the lip-reading 
abilities of normally hearing children have shown that they can recognize real words 
(Clouser, 1977; Conrad, 1977; Dodd, 1977), and decode nonsense words and represent them 
graphemically (Expt 1). Other experiments (Dodd & Hermelin, 1977) have shown that 
prelingually profoundly deaf subjects can recognize rhymes and match homophones on 
information abstracted from lip-reading alone. That is, the deaf can derive a phonological 
code from a visual (lip-read) input. 


Conclusions 


Two experiments investigated the roles of stored auditory and visual (lip-read) information 
in unimodal speech perception tasks. The first experiment compared the lip-reading 
abilities of normal and deaf children, and found that hearing subjects could lip-read words 
designed to be difficult to lip-read better than deaf subjects. The two groups performed 
equally well on words designed to be easy to lip-read. This finding was interpreted as 
evidence for hearing subjects being able to recode visual information into a code derived 
from audition, i.e. the lip-read information was supplemented by stored auditory 
information. The second experiment investigated whether stored lip-read information could 
be used to supplement speech presented only to the auditory modality. The results showed 
that blind and sighted children performed similarly, indicating that stored lip-read features 
are not used to supplement auditory information in the type of task used.

Since speech perception can involve processing information from the visual as well as the
auditory modality, the way in which these two sources of information are integrated must 
be explained. MacDonald & McGurk (1978) have pointed out that no current theory of 
speech perception allows for the processing of lip-read information. They suggest that it is 
possible to modify the motor theory of speech perception (Liberman et al., 1967) and the 
analysis by synthesis model (Stevens & House, 1972) to include visual information since 
both already assume integration between acoustic and articulatory information. However, 
the phonological skills of deaf subjects, such as identifying rhyme and matching 
homophones, have been shown to be related primarily to lip-reading abilities, and not to 
articulatory skills (Dodd & Hermelin, 1977). While the motor theory of speech perception 
does not depend on the quality of the sounds produced, the mismatch between deaf 
subjects’ perceptual and productive abilities (Dodd, in press b) suggests that an alternative
hypothesis should be considered. 

An hypothesis more compatible with experimental evidence would propose that lip-read 
stimuli are encoded into a phonological code. A phonological code is usually conceived of 
as a ‘speech related’ code (Spoehr, 1978) derived either from acoustic or articulatory 
information. However, since the deaf can derive a phonological code from visual (lip-read) 
inputs, there is already evidence that such a code can use visual speech information. Thus 
the concept of a phonological code might be expanded. A phonological code may be a 
non-modality specific code, capable of processing and combining information derived from 
audition, vision (both lip-read and graphic stimuli) and proprioception. The results of 
studies of the lip-reading abilities of deaf and hearing subjects (see introduction) would 
appear to fit this hypothesis. In terms of Expt 1, the hearing subjects' phonological code, 
being derived from both auditory and visual modalities, may be able to use stored auditory 
information in processing lip-read speech. 

The finding that prelingually profoundly deaf children were able to lip-read a nonsense 
word and represent it graphically provides further evidence that deafness does not preclude 
the use of a phonological code for performing complex mental operations, and indicates 
that lip-reading provides a crucial source of information for deaf children. Future research 
on various aspects of lip-reading may provide data that will allow the skill to be acquired
more successfully and extensively.


Appendix I

Nonsense words used as stimuli for lip-reading task

Easy to lip-read    Hard to lip-read
multh               klong
snam                klug
milt                klush
shusp               steeg
dasp                skel
vomp                yisk
thrib               yist
pruv                golsh
flun                trog
bloot               tosk
theld               gank
slath               chank


Appendix II

Same place      Initial     Medial
m-b             banaged     mobent
b-m             mubble      remecca
k-g             goming      eghoed
g-k             kurgle      bekan
d-n             nescribe    harnly
n-d             doticing    corder
f-v             vamily      convused
v-f             fanished    efil
ʃ-tʃ            chould      vaniched
tʃ-ʃ            shanged     pershed

Same manner
b-g             gody        timger
g-b             bhosts      forbotten
ʃ-s             soulder     smasing
s-ʃ             shafety     closher
p-k             kanic       gaking
k-p             pouldn't    maping
m-n             nillion     inagine
n-m             moises      mamy
r-l             lunning     dilection
l-r             raunched    neary


Sex difference in choice reaction time


Ali A. Landauer, Simon Armstrong and Joanne Digwood 








The decision and movement time components of a visual choice reaction-time task were examined 
using students and visitors to a university exhibition. The results of two separate studies showed that 
women have a faster decision time than men, and that men have a faster movement time. Since these 
two effects are in an opposite direction, no sex differences in the mean choice reaction times were 
found. It is concluded that on this particular task the cognitive performance of women is superior














Donders (1868) was first to point out that choice reaction time could be divided into two 
separate components: decision time and movement time. The decision-time component of 
reaction time has been called ‘reaction time’ (Woodworth, 1938; Henry, 1952) and, more 
recently, 'initiation time' (Kerr, 1979). Decision time, which will be the term used here, is 
the time between the onset of the stimulus and the initiation of the response to the 
stimulus. Decision time is thought to represent the central or cognitive aspects of the 
decision-making process (Welford, 1968). Movement time, on the other hand, seems 
primarily to represent a motor process. It is defined as the interval between the initiation of 
the movement response to the stimulus and its completion. It has been identified as the 
peripheral component of the total reaction time and reflects the activity of the effectors 
(Welford, 1968). 

Sex differences in reaction time have been reviewed by a number of authors
(Maccoby & Jacklin, 1974; McGuinness, 1976). For simple reaction-time tasks adult men 
have been found superior to women of all ages (Maccoby & Jacklin, 1974). In choice 
reaction-time tasks girls under the age of 11 are faster than equally aged boys; this 
difference becomes larger as the number of choices increases. Adult men and women 
perform equally on choice reaction-time tasks (Fairweather & Hutt, 1972). 

In numerous experimental investigations little or no relationship has been found between 
decision and movement time (Henry, 1961; Landauer et al., 1979). The vast majority of 
choice reaction-time investigations examined the total aspect of the task: from the onset of 
the stimulus to the completion of the response. 

In an examination of the effect of varying doses of alcohol on both the means and 
variances of a number of trials of a choice reaction-time task, one incidental finding was 
that the decision time of women was significantly faster than that of men, while women's 
movement time was significantly slower than that of men (Landauer & Adamson, 1980). 
Since these two effects are in opposite directions, they largely cancel when the two
components are added. Thus the mean total reaction times of men and women are very similar,
which confirms the finding by Fairweather & Hutt (1972).

Since movement time primarily represents the muscular effector components in a 
reaction-time task, it is reasonable that males, who from puberty onwards are usually more 
muscular than females, would have a faster movement time (McGuinness, 1976). After 
correcting for body size the muscular strength of women is on an average 20 per cent less 
than that of men (Astrand & Rodahl, 1970). Since adult men are found to be faster or no 
different from women in performing choice reaction-time tasks (Fairweather & Hutt, 1972) 
it follows that to compensate for their slower movement time, women's decision times must
be faster. Sherman (1978) reviewed sex differences of cognitive abilities and critically 
examined the empirical findings and main explanatory hypotheses which have been advanced
to confirm or deny that there are sex differences in cognitive abilities. She concludes that
though sex-related differences are small, there are some biologically based cognitive 
differences between women and men. 

The present investigation was made for two reasons. In the first place it was decided to
perform a choice reaction-time experiment in which sex differences are formally postulated
and in which no alcohol is given to subjects. In the second place it
was decided to use both a male and a female experimenter to jointly conduct an 
experiment, since it might have been possible that the differences observed could be 
attributed to the presence of a male experimenter. As Fairweather (1976) points out, the 
sex of the experimenter can have a significant effect on results, yet this variable is only 
rarely considered. 


Method 
Apparatus 


The apparatus consisted of two parts: a control panel operated by the experimenter, and a display 
panel for subjects which is shown in Fig. 1.





Figure 1. Diagram of the display panel 


The display of the apparatus (the part with the number 8 showing) consisted of an IEE Series 0080
Read-Out-Display-Panel which was 76 mm wide and 102 mm high. On it the digits 0 to 9 could be 
shown at the eye level of the subject. Viewing distance was 600 mm. On the lower centre of the 
460 mm wide and 280 mm long sloped response panel there was a push-button with a highly sensitive 
microswitch. Subjects were told to keep the index finger of their favoured hand resting on this button. 
A semicircular array of 10 equally spaced buttons labelled from 0 to 9 was situated 180 mm from the 
rest-button. Subjects were instructed that whenever a number appeared on the display panel it should 
be extinguished as quickly as possible by moving their finger from the rest-button and depressing the 
correspondingly labelled response-button. 

The experimenter who sat 3 m behind the subject activated one of the 10 digits and recorded the 
subject’s responses on two separate millisecond digital timers. The first of these recorded the time
elapsed from the appearance of the digit on the display panel until the subject’s finger was lifted from
the rest-button. This represents decision time. The second timer was activated the moment the first 
had stopped and recorded the time from the beginning of the response movement until the correct 
button was depressed. This represents movement time. It should be noted that decision time includes 
a small movement component. Some muscular activity must occur to reduce the pressure on the
microswitch of the rest-button. This movement component of the decision time is relatively minor 
(lifting the index finger by 2 mm) and is biased against the experimental hypothesis. If women have a 
faster decision time, and this occurs despite the inclusion of a minor movement component in which 
men are superior, then the true sex difference in decision time is underestimated by the test.


Procedure 


After subjects were seated in front of the display panel standardized instructions were read to them. 
These stressed that they had to respond as fast as possible by depressing the button corresponding to 
the number shown on the screen. The subjects were reminded to use one hand only and to return the 
finger to the rest-button after making a response. It was also pointed out that the numbers 1 and 0 
were on the opposite side of the response-button array. Five practice trials were then given. 

In the first experiment 30 trials were given, each digit was presented three times and the same 
random order was used for all subjects. In the second experiment 40 trials were given, each digit was 
shown four times and the same random order was used for all subjects. The mean decision, 
movement and reaction times were calculated for each subject and formed the data of this 
investigation: obviously, reaction time was the sum of movement and decision time. 

In the experiments reported here no subject gave a wrong response. Such errors manifest
themselves by a large increase in the movement time on a single trial, since the timer is stopped only
when the correct button is depressed. In experiments with intoxicated subjects (Landauer &
Adamson, 1980) this occasionally happens.


Subjects 


Subjects in Expt 1 were visitors to a university exhibition who volunteered to participate in an
experiment which measured their reaction time. This experiment was run by one male experimenter
only (A.A.L.). The results of young schoolchildren and persons who appeared to be over 30 years of
age and who volunteered to participate in the experiment were not recorded. The results of Expt 1
are based on the mean responses of 11 men and 12 women.

The subjects in Expt 2 were undergraduate students, most of whom lived in one of the residential
colleges attached to the University of Western Australia. There were 20 men and 20 women in this
group and all were tested in the presence of both a female (J.D.) and a male (S.A.) experimenter.


Results 


The experimental results are shown in Table 1. The differences between the results of the
two experiments were not investigated: the data are based on a different number of trials
and owing to practice effect and population differences it is likely that variances could
differ.


Table 1. Means and standard deviations (in ms) for decision, movement and reaction times
in Expts 1 and 2

                Decision time      Movement time      Reaction time
                Men     Women      Men     Women      Men     Women
Experiment 1
  Mean          581     485        340     446        921     931
  SD            75.6    44.9       71.3    87.0       110.8   80.6
Experiment 2
  Mean          585     534        325     364        910     898
  SD            61.6    56.2       59.2    56.3       87.0    63.9

Note. The means in Expt 1 are based on 30 trials by 23 subjects, while the means in Expt 2 are based
on 40 trials by 40 subjects.

In Expt 1 women had a significantly faster decision time (t = 3.56, d.f. = 21, P < 0.01)
and a significantly slower movement time (t = 3.02, d.f. = 21, P < 0.01) than men.
There were no significant differences between sexes as far as the overall reaction time was
concerned (t = 0.23, d.f. = 21, P > 0.05).

Similar results were obtained in Expt 2. Women had a significantly faster decision time
(t = 2.67, d.f. = 38, P < 0.05) and a significantly slower movement time (t = 2.10,
d.f. = 38, P < 0.05) than men. Again no significant sex difference was found for the total
reaction time (t = 0.48, d.f. = 38, P > 0.05).
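
The way the two sex differences offset each other can be checked directly against the group means in Table 1. The following is a minimal sketch in Python; the dictionary layout and the function name are illustrative assumptions and are not part of the original analysis.

```python
# Group means in ms copied from Table 1: (decision time, movement time).
# The data layout and names are illustrative assumptions.
MEANS_MS = {
    "Expt 1": {"men": (581, 340), "women": (485, 446)},
    "Expt 2": {"men": (585, 325), "women": (534, 364)},
}


def sex_differences(means):
    """Return women-minus-men differences for decision, movement and total time."""
    out = {}
    for expt, groups in means.items():
        dec_m, mov_m = groups["men"]
        dec_w, mov_w = groups["women"]
        out[expt] = {
            "decision": dec_w - dec_m,
            "movement": mov_w - mov_m,
            # Reaction time is the sum of decision and movement time.
            "reaction": (dec_w + mov_w) - (dec_m + mov_m),
        }
    return out


DIFFS = sex_differences(MEANS_MS)
for expt, diff in DIFFS.items():
    print(expt, diff)
```

Negative values mean that women were faster. In both experiments the decision and movement differences have opposite signs, so the net difference in reaction time is only about 10 ms.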


Discussion and conclusion 


The results of the experiments support the hypotheses that women have a significantly 
faster decision time than men, that men have a significantly faster movement time than 
women, and that since these two differences oppose each other, reaction time remains
relatively constant between sexes. 

Welford (1968) distinguishes three stages of reaction time: (a) the time it takes for the 
stimulus to activate the sense organ and for the neural impulses to reach the brain; (b) the
time taken by the central process to identify the stimulus and signal the effector to initiate 
the response; and (c) the time required to energize the effectors and produce the overt 
response. In this study, decision time represents the first two stages of Welford’s model. 

Wargo (1967) believes that the central processes represented by decision time can be 
further divided into perceptual and cognitive components. Decision time, according to this 
model, can therefore be thought to consist of (a) receptor and afferent transmission, (b) a 
perceptual, and (c) a cognitive component. 

It is therefore possible that the faster decision time observed in women may be due to 
faster neural transmission, faster identification of the stimulus, faster decision processing, or 
any combination of these factors. 

The neural physiologies of women and men do not differ sufficiently to allow for the 
relatively large differences in decision time which have been observed to be accounted for 
by differences in the receptor structure or by changes in the speed of neural transmission. It 
is generally recognized that 60 ms is the minimum time required by subjects to separate 
successive stimuli (Wargo, 1967). None of the studies dealing with this observation mention 
significant sex differences in this time span. In speeded matching tasks a sex difference 
favouring women has generally been attributed to the superior perceptual speed of women 
(Very, 1967; Garai & Scheinfeld, 1968). As Majeres (1977) points out, it is difficult to 
distinguish between perceptual and cognitive components in these tasks. In a number of 
experiments he showed that sex differences in performance are dependent on response 
rather than on stimulus variables. He found that women are faster in processing incoming 
stimuli and in making decisions. 

The data from the present investigations, together with the findings by Majeres (1977), 
make it plausible to postulate that the superior performance by women may be due to 
quicker decisions and to better planning strategies. The results also suggest that this 
simpler explanation applies if reaction time is viewed in the classical manner suggested by 
Donders over a century ago. Assuming decision time represents mainly a type of cognitive 
process at which women perform better (Sherman, 1978) and movement represents motor 
processes at which men, owing to muscular development, are superior, then results 
obtained in experimental investigations into sex differences of choice reaction-time tasks 
will be determined by the relative amount of cognitive and physical processes used in a 
specific experimental situation.

This interpretation also explains why post-pubertal males are faster than females in
simple reaction-time tasks (Maccoby & Jacklin, 1974). Simple reaction time involves few if 
any cognitive processes. Since females have a faster reaction time than males in the 
pre-pubertal age groups (Hodgkins, 1963; Fairweather & Hutt, 1972) it seems reasonable 
to suggest that women of all ages have a faster decision time than men: that is, they make 
decisions quicker and use better planning strategies. The findings of this investigation give 
rise to a number of plausible predictions which can be made about female superiority in the 
cognitive components of reaction time. For instance, as the number of alternative responses 
in a Sternberg (1966) type task is increased, one could reasonably expect that the slope of 
the function of mean reaction time plotted against positive set size will be steeper for men 
than for women. This will be more prominent if the muscular movement component of the
response is kept as small as possible. Whether the superior cognitive ability of women is
due to metabolic, hormonal, cultural or social reasons, or whether it is caused by some
X-linked genetic factors, is beyond the scope of this investigation.


Can alphabet recall be part of a visuo-spatial task?


Miranda Hughes, Mona Wilson-Derose and Bernadette Kiely 








Fifty male and 50 female undergraduates were subjects in an experiment where they were required to 
recall the alphabet and to count either the number of letters containing the sound ‘ee’ or the number 
of letters with curves in the upper-case form. The results fail to replicate those of Coltheart et al.
(1975), but are consistent with current models of hemispheric specialization. Males were found to 
complete the visuo-spatial task slower than the females and it is argued that this is due to their 
greater degree of hemispheric specialization which necessitates transfer of information between 
hemispheres. 





The evidence for the fact that humans have left-hemisphere (LH) specialization for 
language and right-hemisphere (RH) specialization for visuo-spatial skills is now well 
documented (Segalowitz & Gruber, 1977). Recent experimental work has focused on the 
question of whether there may be sex differences in hemispheric specialization (e.g. 
Lansdell, 1961; Davidson et al., 1976; Ray et al., 1976); and, in a recent review of this 
work, Hutt (1979) has collated a body of evidence which supports the view that there is 
greater specialization of the RH for visuo-spatial processes in men than in women, 
although the case for differential LH specialization is less strong. 

Coltheart et al. (1975) make the point that ‘...what is needed if these issues are to be 
properly studied are tasks which are purely verbal or purely visuo-spatial’ (p. 439). They 
report an experiment in which subjects (n = 75) were asked to perform two tasks: in the 
first, they recalled the alphabet and reported to the experimenter the total number of letters 
which had an ‘ee’ sound; in the second, they recalled the alphabet but were instead asked
to report the number of letters with a curve in the upper-case form. The authors argue that 
these were either purely verbal or purely visuo-spatial tasks and their results (namely that 
males perform better on the visuo-spatial task and females perform better on the verbal 
task) support the widely held beliefs of sex differences in spatial and verbal skills. 

The present paper describes our own attempt to replicate the results of Coltheart et al. 
(1975) on a sample of 100 undergraduate students. 


Method* 
Subjects 


Fifty male and 50 female right-handed undergraduate students, from a variety of disciplines, with a
mean age of 20 years and 1 month, took part. 


Procedure 


Each subject completed two tasks. The order of presentation of the tasks was randomized. Subjects 
were given the following verbal instructions by a female experimenter:


Task 1: verbal task. ‘Please proceed mentally through the alphabet from A to Z counting the
number of letters which contain the sound “ee” including E itself. Do this as quickly as you can and
only speaking out loud when you have decided on the final solution. Please close your eyes and, if
necessary, count on your fingers. Proceed when I say “go”.’


Task 2: visuo-spatial task. ‘Please proceed mentally through the alphabet from A to Z counting the
number of letters which have a curve in them. Consider each letter as a capital letter as it would
appear in type-print rather than in your own hand-writing. Do this as quickly as you can and only
speaking out loud when you have decided on the final solution. Please close your eyes and, if
necessary, count on your fingers. Proceed when I say “go”.’

Subjects were timed from the word ‘go’ until they uttered their solution. Their responses were
recorded without comment.


* Because this article reports a failure to replicate, our method is described in some detail.


Results 


Coltheart et al. (1975) report that the measure which best distinguished the males from the 
females on these tasks was the number of subjects making the correct response in each 
condition. Figure 1 shows a reproduction of their results as well as our own. There is a 
strong similarity in the results for the female population, but the results for the males are 
highly discrepant. In our study, more females than males were correct in both tasks, and 
this difference was more marked on the verbal task. However, these results do not yield a 
significant difference between the sexes (Verbal task: χ² = 1.44, d.f. = 1, n.s.; Curves task:
χ² = 0.19, d.f. = 1, n.s.).
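
As a check on the verbal-task comparison, the Pearson chi-square statistic for a 2 × 2 table can be computed from the counts of correct responses given in Table 1 (22 of 50 males and 28 of 50 females). The sketch below is illustrative only: the function name is hypothetical, and it assumes the uncorrected Pearson statistic, which the text does not state explicitly.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Verbal task: rows are males and females, columns are correct and incorrect.
chi2_verbal = pearson_chi2_2x2(22, 28, 28, 22)
print(round(chi2_verbal, 2))
```

Under these assumptions the verbal-task statistic agrees with the value of 1.44 reported above.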


[Figure 1 plot. Vertical axis: percentage of subjects giving correct response. Horizontal axis: task condition (visuo-spatial, verbal).]


Figure 1. Percentage of subjects giving the correct response on each task: ——, present study; 
— — —, data from Coltheart et al. (1975). 


A certain similarity in performance between the sexes is further evidenced by an 
examination of errors involving ‘false alarms’ or ‘misses’. The actual number of ‘ee’ 
sounds in the alphabet is eight, and the number of curves is eleven; ‘false alarms’ occurred
when subjects overestimated the correct responses, and ‘misses’ occurred when they
underestimated. The incidence of both types of error was calculated for both males and 
females and the data are presented in Table 1. There is an interesting tendency for subjects 
to underestimate the number of ‘ee’ sounds and to overestimate the number of curves, 
although it is not clear why this should be the case. 


Table 1. Frequency of errors in verbal and visuo-spatial tasks

        Verbal task                      Visuo-spatial task
Response    Males    Females      Response    Males    Females
5             2         3         6             0         1
6             5         5         7             0         0
7            14         7         8             1         0
8ᵃ           22        28         9             1         3
9             2         5         10            7         4
10            3         0         11ᵃ          21        23
11            2         0         12           11        10
12            0         1         13            5         6
13            0         0         14            2         1
14            0         0         15            2         0
15            0         1         16            0         1
                                  17            0         1
Total        50        50         Total        50        50

ᵃ Correct response.


The most interesting result comes from the analysis of the times taken to complete the 
two tasks, the data for which are presented in Table 2. There is no significant difference 
between males and females in the time taken to complete the phonemic task (t = 0.074,
d.f. = 98, n.s.); however, the times they take to complete the visuo-spatial tasks do differ
significantly (females taking less time than males) (t = 2.34, d.f. = 98, P < 0.01, two-tailed).
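
The t values above can be approximated from the summary statistics in Table 2. The following sketch assumes an independent-samples t-test with pooled variance, 50 subjects per group, and that the tabulated SDs are sample standard deviations; none of these details is stated in the text, and the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance; returns (t, d.f.)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Means and SDs in seconds from Table 2 (males first, then females).
t_verbal, df = pooled_t(18.77, 6.1, 50, 18.69, 4.8, 50)
t_spatial, _ = pooled_t(22.9, 5.6, 50, 20.31, 5.5, 50)
print(round(t_verbal, 2), round(t_spatial, 2), df)
```

With these assumptions the statistics come out close to the reported values (about 0.07 and 2.33 on 98 degrees of freedom); small discrepancies are to be expected from rounding in the table.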


Table 2. Time taken (s) to complete verbal and visuo-spatial tasks

            Verbal task          Visuo-spatial task
Males       18.77 (SD = 6.1)     22.9 (SD = 5.6)
Females     18.69 (SD = 4.8)     20.31 (SD = 5.5)


Discussion 


It is tempting to argue that the failure to find statistically significant sex differences in 
accuracy on these tasks is due to the fact that males and females are indeed of comparable 
ability for simple verbal or visuo-spatial tasks. Yet, on closer consideration, the tasks are 
not as ‘purely verbal’ or ‘purely visuo-spatial’ as they were intended to be. Since no 
subject reported doing the visuo-spatial task without using initial phonemic recall (a 
process which in contrast to visual recall is overlearned during constant recitation in the 
pre- and primary school) both tasks have an initial verbal processing requirement, whereas
in the curve-search task phonemic recall is used to retrieve the visual representation of
each letter, which is then processed for the existence of a curve. Furthermore, short-term 
memory is implicated in both tasks while the sequential processing takes place. The results 
concerning the time taken to complete the two tasks provide support for current models of 
hemispheric specialization, which suggest that females have bilateral representation of 
language skills whereas males do not. The reason that males take longer to complete the 
visuo-spatial task may be ascribed to their greater hemispheric specialization. Summarized, 
the argument runs as follows: 

(1) In women, phonemic recall of the alphabet is possible in the right hemisphere; in men it
is not (McGlone, 1977; Witelson, 1977; Hutt, 1979).
(2) Deciding whether a letter has a curve in it can only be carried out in the right
hemisphere, and this is so for both sexes (Harris, 1977).
(3) Phonemic recall is a necessary prior stage to the curve-judgement task. 

If all three of these statements are true, then the curve-judgement task will require 
inter-hemispheric transfer in men but not in women. We can therefore speculate that the
extra time taken by men to complete the curve-search task is due to the time taken for
inter-hemispheric transfer.


The final issue which requires consideration is why our data should differ from those of 
Coltheart et al. (1975). Since a similar method was used in both experiments, it is likely 
that there was some important difference in the populations which were studied. The fact 
that Jorm (1979) has failed to replicate another experiment which Coltheart et al. reported 
simultaneously suggests that this may indeed be the case.


Tutorial Essays in Psychology. A Guide to Recent Advances. Vol. 2. Edited by N. S. Sutherland.
Hillsdale, N.J.: Lawrence Erlbaum. 1979. Pp. vii + 161 (no price given).


The second volume in this series contains four review essays: ‘Long delay learning’ by Bow Tong 
Lett, ‘Spatial Fourier analysis’ by Mark Georgeson, ‘Echoic storage’ by Dennis H. Holding and 
‘Analyzing memory by cuing’ by Gregory V. Jones. 

Bow Lett gives a brief resumé of work showing long delay learning in animals, notably learned 
flavour aversion, particularly for poisonous substances, and ‘home cage’ delayed learning of two 
types: (a) delayed response learning, in which a rat on a subsequent trial can make use of a 
discriminative cue given some time before on a previous trial, provided it is taken out of the 
experimental apparatus and kept elsewhere (usually in its home cage) between the trials; (b) delayed 
reward learning, in which the animal is removed from the apparatus to its home cage after making a 
discriminative response and is rewarded only when it is returned after a delay to the start box of the 
apparatus. 

Revusky (Bow Lett’s husband) attempted to explain long delay learning in all such cases by using 
two related ideas: (1) stimulus relevance — the events to be related, such as flavour and sickness, must 
be relevant to each other and irrelevant to other events during the delay period, otherwise the 
association will be subject to — (2) concurrent interference from the other events. 

On this view, home-cage delayed learning depends on events being associated by a particular form 
of stimulus relevance, viz. situational relevance — an event is more readily associated with events 
occurring in the same situation than with events occurring in other contexts. 

Bow Lett's own contribution is an attempt to explain long delay learning in terms of two different 
kinds of memory states (following Hebb). ‘Inactive’ memory is the holding of information in a fairly 
permanent store. In order to influence behaviour a memory must be brought back to the 'active' 
state by the influence of external or internal stimuli. Using evidence from consolidation experiments, 
she assumes that memories are subject to concurrent interference only while in the ‘active’ state, and 
she argues that the crucial feature of home-cage delay learning is that memories pass quickly into the 
inactive state to be reactivated later by cues from the test apparatus. In contrast, the evidence from 
flavour aversion studies leads her to suppose that flavour memories remain in the active state for 
much longer periods. They can therefore be influenced by concurrent interference, and the strength of 
the association will be determined by the amount of such interference. As Bow Lett points out, 
however, much of this remains speculation until there is better evidence on the nature of memory 
storage. 

Georgeson presents a very clear account of Fourier analysis of visual patterns, the measurement of 
contrast sensitivity functions and the evidence for the view that the visual system comprises a number 
of separate ‘channels’ each tuned to a different optimal spatial frequency. This way of analysing the
functions of the visual system raises doubts about many of the concepts used since the days of Hubel 
and Wiesel, such as the idea of individual cortical neurones acting as line detectors. A narrow line 
has a broad Fourier spectrum and hence would stimulate a wide range of channels sensitive to 
different spatial frequencies. Georgeson argues that spatial features such as line width are not detected 
but encoded by the overall pattern of responses in a multichannel system. He compares the work of 
the brain in spatial perception with the problem of solving a jigsaw puzzle. Each piece (hypercolumn) 
contains a detailed Fourier representation of one patch of the image, but only a rough indication of 
its location in relation to other pieces. The spatial arrangement is achieved not because each piece has 
precise spatial coordinates, but by the coherence which emerges from a set of matching operations. 
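
The point that a narrow line has a broad Fourier spectrum can be illustrated numerically. The sketch below is not taken from Georgeson's essay: it assumes a one-dimensional luminance profile of 256 samples, models a line as a rectangular bar of unit luminance, and uses a hypothetical helper, half_amplitude_bandwidth, which counts the frequency components whose amplitude is at least half the zero-frequency amplitude. That cut-off is an arbitrary choice made only for illustration.

```python
import cmath


def line_profile(width, n=256):
    """Luminance profile of a bar `width` samples wide, centred in `n` samples."""
    start = (n - width) // 2
    return [1.0 if start <= i < start + width else 0.0 for i in range(n)]


def magnitude_spectrum(signal):
    """Amplitudes of the discrete Fourier transform for frequencies 0..n/2."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def half_amplitude_bandwidth(signal):
    """Number of frequency components with at least half the zero-frequency amplitude."""
    spectrum = magnitude_spectrum(signal)
    return sum(1 for amplitude in spectrum if amplitude >= 0.5 * spectrum[0])


narrow = half_amplitude_bandwidth(line_profile(width=2))
wide = half_amplitude_bandwidth(line_profile(width=32))
print("narrow line:", narrow, "components; wide bar:", wide, "components")
```

The narrow line spreads its energy over many more spatial-frequency components than the wide bar, so in a multichannel system it would drive channels across a wide range of preferred frequencies.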

Holding writes a highly critical review of the evidence for echoic storage. Clearly, auditory signals 
must be integrated over time, but is there any clear evidence for a period of time during which 
sensory information is held in its original form in an echoic store? 

Plomp found a value of about 200 ms for the decay of auditory sensations arising from pulses of 
noise, and Massaro gave 250 ms as the time over which a delayed masking tone would affect pitch 
identification of a test tone. These values fall far short of other estimates of the time for echoic storage
derived from experiments on dichotic listening, and experiments on precategorical acoustic storage
using the ‘suffix’ effect. Holding considers that none of these experiments provides evidence for an
echoic store. It is not clear what happens to signals on the unattended ear in dichotic listening, but
there is good reason to suppose that they are processed to a considerable extent. Experiments using 
the suffix effect have shown that this effect depends on the semantic properties of the suffix. This 
makes it very unlikely that precategorical acoustic storage is entirely echoic. In short, the pure echoic 
store may be a will-o’-the-wisp to entice and mislead the unwary. 

Jones reviews experimental work aimed at understanding the memory processes underlying 
common mental activities, such as identifying objects or recalling events. After a brief consideration 
of multi-component theories of memory and of depth-of-processing models, he deals with the use of 
cueing in investigating the recall of information having several different components. He makes a 
distinction between ‘intrinsic’ information encoded in the experiment itself, and ‘extrinsic’ 
information derived from outside the experimental situation. In his view, cues may serve to access 
either type of information, but in rather different ways. If a cue is used to access extrinsic knowledge, 
recall becomes a process akin to generation followed by recognition. On the other hand, when cues 
are used to access intrinsic information the results fit predictions from his version of the 
fragmentation theory of memory. 

In passing, he suggests that the amnesics investigated by Weiskrantz and Warrington may be 
suffering from an inability to use intrinsic information, whereas they are relatively unimpaired in 
using extrinsic knowledge in cued recall. 

Each of the four articles provides adequate, if not abundant, references, and the topics are well 
chosen to fulfil the editor’s intention of encouraging psychologists to look at new developments 
beyond the confines of their own immediate interests. 

ROY DAVIS 


Experimental Psychology: A Small-n Approach. By Paul W. Robinson & David F. Foster. London:
Harper & Row. 1979. Pp. x + 239. £2.85.


The aim of this compact but ambitious book is, in the authors’ words, ‘to provide a text for the 
person interested in a relatively low-level explanation of single-subject experimentation in psychology’ 
(p. ix). This category of intended reader most probably includes people who already have a
reasonable grounding in ordinary, large-n research methodology, whether or not acquired in a 
specifically psychological context, and who wish to find out whether small-n approaches are for them, 
but without rupturing their information-processing channel. As such it will almost certainly find its 
uses, but its utility as a comprehensive introduction to the domain of single-subject research in 
applied settings is limited by the authors’ unexplained restriction of their scope to the realm of 
operant procedural technology. 

The book has three main parts. The initial two chapters deal with historical background, chapters 
3-6 with the specific characteristics of small-n experimentation, and the remainder with examples of 
small-n approaches in basic and applied research, including clinical, educational, industrial and other 
areas (police patrolling strategies, litter control, and biofeedback). 

In the historical chapters the authors rightly point out that small-n research was the norm during 
the first 50 years of psychology and was only displaced in the 1920s by Fisherian large-n 
methodology. This, the authors rightly point out, is one, rather than the only, valid method of obtaining
objective psychological data. However, their account of the re-emergence of small-n methods in the 
past 20 years is surprisingly barren of any reference to the pioneering efforts of M. B. Shapiro, whose 
first studies using single-case methodology appeared in the early 1950s. By omitting Shapiro's seminal 
contribution to the hypothetico-deductive experimental investigation of individual psychiatric 
patients, Robinson & Foster would seem to commit the same sort of error as they would condemn in 
those who regard large-n research as the definitive way of doing psychology. Operant methods are one,
rather than the only, valid method of obtaining objective psychological data about single cases. For
asking certain types of questions — determining whether a particular manipulable variable has 
quantitative effects upon some designated, experimenter-chosen measure of stably emitted 
behaviour — operant technology is undoubtedly invaluable. But there are more interesting applied 
questions on earth than can be dreamt of in Robinson & Foster's philosophy. Specifically, they leave
out the challenging how-questions. The reader of their account would emerge from his sojourn none 
the wiser that an investigator can also test hypotheses about mechanisms that may possibly be 
responsible for observed qualitative effects by single-case experimentation and then go on to establish 
the generality of intra-subject findings by a series of inter-subject replications.

If the book were to be used in teaching applied research methodology, it would certainly need to
be supplemented by additional readings, including more examples of operant-type small-n studies in 
applied settings as well as papers representing the Shapiro tradition. As a self-instruction manual in 
the absence of guidance to a wider range of applied readings, the book, which is clearly written and
well illustrated with numerous diagrams and graphs, would seem likely to enlighten members of its 
intended readership in a regrettably one-sided manner. 

VICKY RIPPERE 


Biofeedback—Principles and Practice for Clinicians. Edited by J. V. Basmajian. Baltimore, MD: 
Williams & Wilkins. 1979. Pp. x + 282. $31.95.


The remarkable growth of biofeedback is reflected by the proliferation of volumes devoted to its 
applications — both experimental and clinical. Indeed one can legitimately inquire whether any new 
publication upon the subject contributes anything of importance to the existing literature. Even a 
cursory examination, however, will reassure the prospective reader that the present volume edited by 
Dr Basmajian has a well-defined objective that is both crucial and timely. It is offered as a ‘summary 
and explication of the best applications of biofeedback treatment techniques’, and is primarily 
concerned with the practicalities of biofeedback. 

One of the strengths of the volume is that, unlike many of its contemporaries, it is directed toward 
a circumscribed audience of clinical psychologists and others directly concerned in the administration 
of biofeedback procedures as clinical therapy. It restricts itself to consideration of but one aspect of 
the complex mosaic presented by biofeedback studies, namely current therapeutic applications. Its 
approach is entirely pragmatic, as is emphasized by the editor's preface and also by his choice of 
contributors, all of whom are actively engaged in clinical research and therapies involving
biofeedback. The explicit statement of its goals gives the volume considerable unity of purpose despite
being a compilation of 21 chapters by different authors.

The contents of the contributions vary from the general and introductory, such as Basmajian's 
introduction to the principles and background of biofeedback, to the more specialized and restricted 
which constitute the greater part of the volume. Together they introduce to the reader the diverse set 
of disorders, both functional and organic, that are currently being treated with varying degrees of 
success by therapies that include at least in part biofeedback procedures. Thus individual 
contributions survey the use of biofeedback procedures for the treatment of muscular, neurological, 
cardiovascular, gastro-intestinal, psychosomatic and psychiatric disorders. It is important to notice
that few advocate the use of biofeedback in isolation as a ‘pure’ therapy. More often its use is
advised as an adjunct to existing, traditional therapies within an integrated treatment approach. This
raises vital questions concerning assessment of the value of the biofeedback component for the
effectiveness of the combined therapy. Disappointingly these questions are afforded only minimal
consideration. Similarly there is insufficient discussion of one of the major issues confronting
biofeedback therapies, namely how transfer of training from clinic to normal-life situations is best
achieved. 

One of the problems facing the clinician is which response modality feedback to use for the 
treatment of various disorders. As one of the contributors (Stoyva) remarks, ‘EMG feedback has
become the clinical workhorse of the biofeedback area’, and the truth of his statement is exemplified
by the emphasis EMG feedback receives in the present volume. Five of the contributions are solely,
and several others at least partially, concerned with EMG feedback applications. Whilst this is an
accurate indication of the current status of biofeedback it should be borne in mind that alternative
response modalities are available and may prove more valuable in the future. Several contributors 
rightly emphasize the importance of adapting the particular procedures, whichever response modality 
is chosen, to the individual patient's needs, and also of careful patient selection. Biofeedback 
procedures have been applied indiscriminately in the past, which has contributed to the doubts about 
its therapeutic status. 

Since the orientation of this volume is toward the practicalities of biofeedback techniques it is
entirely appropriate that it should contain two chapters devoted to the equipment needs for
biofeedback as well as a further chapter upon the basic electronics needed by the clinician wishing to
adopt biofeedback instrumentation. These chapters are particularly useful for clinicians with little
experience of biofeedback procedures. I was disappointed, however, to find that the potential role to
be played by computers for all aspects of biofeedback therapies (assessment, diagnosis, control, and
analysis) was only briefly discussed. Computers must become an essential clinical tool in the future,
and they are emerging as such even now. 

Any judgement of this volume must take account of its particular emphasis and purpose. As an 
introduction to the problems and possibilities of biofeedback procedures for clinical application it is a 
valuable addition to the literature. It will be a considerable aid to clinicians wishing to adopt 
biofeedback procedures, particularly those involved in rehabilitation therapies. Other readers,
however, may be disappointed by its failure to address important issues confronting biofeedback.

KEITH PHILLIPS


The Experimental Analysis of Behavior: A Biological Perspective. By E. Fantino & C. Logan. San
Francisco, Calif.: W. H. Freeman. 1979. Pp. 559. £8.40.


This textbook contains an unusual assortment of material. After an initial chapter introducing the 
topic of species differences in learning ability by reference to honey bees, there follows a history of 
learning theory up to Skinner; a single chapter covering habituation, sensitization and classical 
conditioning; five reviewing a wide range of research on operant conditioning; one giving a potted
introduction to subprimate ethology; one on examples of learning in naturalistic environments; two 
reviewing invertebrate learning; and a final chapter aimed at integration of the foregoing. 

It is perhaps the first textbook attempting to exploit the recent rapprochement between laboratory 
workers on behaviour theory and ethologists, and it should be evident from the above that the 
contents are not only oddly ordered but potentially unbalanced in coverage, quite apart from the 
strangeness of the bedfellows. However, the book is admittedly trying to do something new, and 
must be judged by its success in integrating the material, since all its component parts are amply 
covered by other elementary books. 

The treatment of learning has, as implied by the title, a pronounced Skinnerian bias, which is 
evident in the chapter on history as well as from the distribution of topics outlined above. The 
coverage of habituation and, particularly, sensitization, is superior to that in other textbooks, though 
in places references to the literature are irritatingly absent (e.g. in the quite contentious account of 
dishabituation). The section on classical conditioning is seriously weak; the explanations of such 
important topics as correlation vs. contiguity in conditioning, and the Rescorla-Wagner theory,
would not, I think, be clear to the average student. Major subjects of current interest, such as 
higher-order conditioning and conditioned inhibition, are hardly mentioned. Most theorists would 
now include the autoshaping effect (arguably of considerable importance to a ‘biological
perspective’) under classical conditioning, but the orthodox Skinnerian views of the authors prevent
them from doing so. Autoshaping is perversely introduced in the operant chapter, under ‘shaping’,
and its Pavlovian features are only mentioned grudgingly and in passing.

As might be expected, the chapters on operant conditioning are far more satisfactory. They cover 
stimulus control, conditioned reinforcement, aversive learning and choice, and attempt to give a 
reasonable weight to theory. Issues are discussed in a clear and critical manner, with attention to 
methodological points. Experiments by the first author receive very frequent mention. 

Up to this point the 'biological perspective' has played little part, apart from a cursory mention of 
variations in the ease of training different responses for shock avoidance (indeed the ‘autoshaping’ 
theory of behavioural contrast is castigated for being too trendily biological!). There is now an abrupt
shift to ethology. The chapter reviewing ethological principles is ‘old-fashioned’ and omits themes
such as primate behaviour or sociobiology. The next describes research on several well-known 
‘naturalistic’ examples of learning — imprinting, song acquisition, foraging, and recognition of 
distasteful prey. Bird navigation is included for no very obvious reason. This section makes lively 
reading, but draws heavily on secondary sources, including Scientific American articles, by contrast 
with the operant conditioning section which conveys detail and expertise. While lip-service is paid to 
the notion that species differences in learning may be determined by ecological factors, little detailed 
evidence is mentioned, let alone given any critical assessment. Hardly any attempt is made to relate 
these findings to the preceding material on conditioning: indeed, search image formation and 
imprinting are described as forms of perceptual or exposure learning, terms which do not appear 
anywhere else in the book. 

The next two chapters on invertebrate learning seemed (to this non-expert reviewer) much better 
integrated, both in their own fairly wide-ranging coverage and in relation to the rest of the book. 
Methodological points first raised in the chapters on conditioning are well deployed here. Once again, 
the opportunity to discuss determinants of species differences is scarcely taken up. 

The burden of interconnecting the foregoing chapters, so diverse in their approach and their degree
of sophistication, falls on the last chapter. This first reviews some ways in which field and laboratory
techniques may be modified to achieve greater comparability. The authors make much of the fact 
that experiments on foraging (Krebs) and food procurement (Collier) using operant techniques yield 
results similar to those of psychologists’ studies of concurrent and chain schedules, but that is hardly 
surprising given that the procedures were essentially identical in the various cases, except that Krebs 
employed wild-caught tits rather than laboratory animals. It might be more fruitful to examine 
thoroughly the difference between mathematical models of choice between concurrent schedules, 
treated at length in chapter 7, and (say) Charnov's model of optimal foraging, which receives only 
passing mention in chapter 13. 

The last point exemplifies the main characteristic of this book — its inherent bias towards the 
operant approach. A further example is the cursory final discussion in chapter 13 of the ‘constraints 
on learning’ debate, which seems concerned to defend operant studies without seriously considering 
whether different learning processes might operate in different species or situations. Similarly, in 
connection with Shettleworth’s experiments the possibility that Pavlovian conditioning might be 
responsible for interference with operant acquisition is treated briefly and negatively, although there
is other evidence in the literature for such effects. In short, the subtitle A Biological Perspective is
misleading: the book primarily presents an operant psychologist’s perspective. Its attempt to
juxtapose operant and ethological approaches, while interesting and useful as a basis for discussion, 
does not really succeed in allowing it to be read on its own as a balanced textbook. 

E. A. GAFFAN 


Training in Small Groups. Edited by B. Babington Smith & B. A. Farrell. Oxford: Pergamon.
Pp. xii + 144. £7.50.


The literature on the use of training methods with small groups in Britain is not great and it is quite
difficult to discover what recent developments have been going on here since the early days of 
T-groups, sensitivity training and leaderless groups. 

This book opens with chapters by five different people, each describing a different training 
procedure. These are all variants on providing minimum leadership to small groups and getting them 
to discover, through their own activities and reactions, useful techniques in coping with human 
relationships. Three of the approaches have a psycho-dynamic background. They emphasize the 
individual's welfare and improvement in his sensitivity to what goes on in groups. The other two 
concentrate on leadership or management tasks, and emphasize improvements in job performance. 
The quality of writing in these chapters varies. Adair's on Sandhurst training in leadership is brief,
clear and lively. Marcus' on therapeutic groups in Grendon Underwood psychiatric prison is
thoughtfully presented and extremely interesting. The other three are all too long and wordy for what 
they have to say and at least one of them is so laboured in style as to have a decidedly soporific effect 
upon the hapless reader. It may be that these three writers do not wish to be too specific and simple 
in their outlines, for they are all from sources which financially would not be likely to welcome the
sincerest form of flattery. 

The five descriptions are followed by two crisp and clear analytic chapters, each by one of the 
editors. The first tackles some questions of logic and philosophy, looking at the claims made for the 
five schemes and deciding whether they are justified. We are left in no doubt that most of the claims 
are too sweeping; and also that those that are not may be self-fulfilling. However, the logic is too cut 
and dried to provide convincing reasons for rejecting the techniques earlier described; and even 
self-fulfilling systems may well allow people to handle situations and material better, and get jobs 
done faster, than before they had the systems available to them. The second of these chapters 
compares and contrasts the five approaches on nine variables of interest to psychologists. These are
such things as whether the emphasis is upon performance of the group as a unit, as a team, or as
separate individuals; whether the group is pointed more towards studying the processes of its own
interaction or the way it completes the specific tasks it faces; and whether theory or experience is
more stressed. It is a most useful chapter, since it weighs up the different techniques in an impartial
fashion. It points out very sensibly that the reader will probably prefer a system which most nearly
suits the task for which he is considering the training. 

In summary, then, this is a useful book for anyone who wishes to get some insight into some 
current methods of training people to handle small groups, and this is so mainly because of the two 
final chapters which should help to inculcate a balanced view of the aims and achievements of those
methods. 

SHEILA CHOWN


Basic Writings in the History of Psychology. By Robert I. Watson. New York: Oxford University
Press. 1979. Pp. xviii + 420. £5.95.


This book consists of excerpts from 48 authors from Galileo to Skinner (the only one still living), 
preceded by brief commentaries which supply biographical details and state the topic to be discussed, 
and followed by even briefer evaluative paragraphs summarizing the main contribution of each 
author. Where several excerpts are taken from the same writer further commentaries are interspersed,
but in an insufficiently distinct typeface so that the reader is frequently pulled up short by them. 

A somewhat curious selection procedure was used, the criterion for inclusion of authors and topics 
being occurrence in at least three out of five history texts chosen for examination (namely, those by 
R. Lowry, D. N. Robinson, D. P. Schultz, R. I. Watson and M. Wertheimer). On what grounds was 
Boring's excluded? The resulting selection received some support from eminence ratings by 
contemporary historians of psychology reported in Annin et al. (1968). The prime consideration is
unashamedly admitted to be educational value: ‘What students need are excerpts that common 
opinion would have it were the most important, the most widely accepted as seminal. They may 
appear commonplace and conventional to persons already familiar with the history of psychology 
[they do!], but the book is not addressed to them.’ Given the authors, the excerpts are highly 
predictable. Locke on primary and secondary qualities, Galton on hereditary genius, William James 
on the stream of consciousness, memory, habit and the self, Dewey on the reflex arc, and Lashley on 
mass action, are typical. Watson ‘has resisted the scholarly impulse to rescue from oblivion 
heretofore unanthologized documents’ and makes no apology for the fact that ‘many, if not most, of 
the excerpts have been used in previous anthologies' (for example, those by Dennis, Diamond, Koch, 
Mandler & Mandler, and Murchison). Thus, we have no new translations, though judicious use has 
been made of summary articles (in the cases of Pavlov, von Ehrenfels, Külpe, Köhler and Skinner).

The arrangement followed is generally chronological so that there are some odd juxtapositions 
(Jung is followed by Lashley) and disjunctions (11 others separate Titchener from Wundt). The book 
begins with Galileo and Bacon on methodology. The next section reads much like a chapter on 
association but with the customary balance between text and quotation reversed. Rationalistic 
thought is introduced through Descartes and Kant. Next come 19th century physiologists and 
biologists. Then follow representatives of structuralism, functionalism, Gestalt psychology, 
psychoanalysis and behaviouristic learning theory. Indeed, it might be said that some of these are 
overrepresented: Gestalt psychology by Wertheimer, Köhler, Koffka and Lewin; psychoanalysis by
Freud, Adler and Jung; and behaviouristic learning theory by no fewer than Thorndike, Pavlov, 
Watson, Tolman, Guthrie, Hull and Skinner. The most obvious omission (apparently resulting from 
the selection procedure) is anything earlier than the 16th century (Aristotle, for example). It is also
rather thin on idiographic and differential approaches: there is nothing from Allport, Binet,
Spearman, Piaget or Rogers. 

The commentaries are straightforward and somewhat simplistic: for example, ‘Hobbes is best
known as a social philosopher and thus as a precursor of social psychology’. In other cases they are 
misleading. The impressions are given that Berkeley was a mentalist rather than a subjective idealist,
that classical psychophysical methods are fundamental to sensory measurement today, that von 
Ehrenfels’ form qualities did not represent a new tradition, and that the nervous system is 
unobservable. More seriously, the evaluative summaries go little beyond stating consonance or 
dissonance between historical and current views. The treatment remains largely uncritical. We are 
told that vitalism gave way to mechanism; that Romanes’ methods were anecdotal, inferential and 
analogical; and that McDougall's conceptual framework of instinct is no longer fashionable; but no 
critical analysis of the weaknesses of these positions is offered. 

One may query the value of such an enterprise as this on two grounds. Firstly, there is the decline 
in the study of the history of psychology and recourse to original sources. Watson's frequent 
remarking, with apparent surprise, that some idea has persisted 'to this very day' suggests that even 
he is somewhat taken aback to discover the relevance of historical excerpts to current problems. 
Secondly, at least half a dozen similar anthologies are already in existence. (It is unlikely that the 
existence of the present one is unrelated to Watson's publication of a biographical guide to the 
history of psychology last year.) The lack of contextual background, thematic organization and 
critical analysis makes this volume inadequate for undergraduates in anything other than a 
supplementary capacity. Having said that, the attractiveness of the present selection, which is 
interesting, entertaining and highly readable, and its succinctness recommend it as a popular choice 
among its competitors. If it merely served to whet students' appetites for further exploration it would 
have served a worthy purpose, but it is somewhat doubtful in the current climate that its pages will 
ever be opened. E. R. VALENTINE 
The British Journal of Clinical Psychology 


The new British Journal of Clinical Psychology builds upon the foundations laid over two 
decades by the clinical section of the British Journal of Social and Clinical Psychology. It 
publishes new findings, theoretical, methodological and review papers bearing on the whole field 
of clinical psychology, and includes: 


* descriptive and aetiological studies of psychopathology 

* studies of the assessment and treatment of psychological disorders 

* applications of psychology to medicine and health care 

* social and organizational aspects of psychological disorder and ill-health 


The journal’s contributors, editorial consultants and readership are international. Contributions 
combine scientific excellence with relevance to the concerns of practising clinicians, bridging the 
gap between academic and professional interests. Thorough empirical work with clinical and 
community populations is especially prominent. 
* correspondence 